subject:"Denormalisation, magic, and is it really that useful\?"

Re: Denormalisation, magic, and is it really that useful?

2008-09-26 Thread Alex Koshelev

Hi, Erik.

The main purpose is to have declarative form of composition field
calculation definition. Not to write imperative actions/signal
connection/etc.

I've made only flexible generic solution. In future I plan to write
high-level subclasses that will can with minimal input parameters make
all work very comfortable for developers.
And then there will be no need to write all denormalisation
functionality once again and again, from project to project.

On Sep 26, 8:56 pm, Erik Allik <[EMAIL PROTECTED]> wrote:
> I was just wondering.. Why all this abstraction?
>
> Why do we need a separate field for denormalization? Can't we just use  
> regular fields and simply set up denormalization in a procedural way  
> in the constructor? All that needs to be done to create a denormalized  
> field is connect a few signals maybe or do some expensive compuation  
> after modifying a field. But maybe I'm missing something crucial.
>
> Erik
>
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-26 Thread Erik Allik


I was just wondering.. Why all this abstraction?

Why do we need a separate field for denormalization? Can't we just use  
regular fields and simply set up denormalization in a procedural way  
in the constructor? All that needs to be done to create a denormalized  
field is connect a few signals maybe or do some expensive compuation  
after modifying a field. But maybe I'm missing something crucial.

Erik

On 26.09.2008, at 18:45, Alex Koshelev wrote:

>
> Hi, guys!
>
> For a long time I have thoughts to make composition/denormalisation a
> little bit easier and reusable. But I have no time to implement my
> ideas.
> Inspired by Anrew's blog post and its thread I recently wrote some
> code. And I think that it really may be useful and cover most of
> denormalisation use cases in Django.
>
> And this is an example of Anrew's code but with my CompositionField:
>
> class Picture(models.model):
>user = models.ForeignKey(User)
>user_username = CompositionField(
>  native=models.CharField(max_length=150),
>  trigger=dict(
>  sender_model=User,
>  field_holder_getter=lambda user:
> user.picture_set.all(),
>  do=lambda picture, user, signal:
> user.username
>)
> )
>image = models.ImageField(upload_to="foo")
>title = models.CharField(max_length=100)
>
> Its more verbose but very generic and can be customized. And free
> bonus - auto-generated `update_user_username` method.
>
> The code available here [1]. Its very generic but allows to write more
> high-level subclasses with some introspection.
>
> Today I've written the blog post about it and my future suggestions.
> It is in Russian[2], here is a google-translated version [3] (but with
> broken code blocks) that I think can help to understand some concepts.
> And it holds more usage examples.
>
> I have many new ideas of how my solution can be improved and I what to
> hear you opinions about it.
>
> Thanks.
>
> [1]: http://svn.turbion.org/turbion/trunk/turbion/utils/composition.py
> [2]: http://webnewage.org/post/2008/9/26/krasivaya-kompozitsiya/
> [3]:
> http://translate.google.com/translate?u=http%3A%2F%2Fwebnewage.org%2Fpost%2F2008%2F9%2F26%2Fkrasivaya-kompozitsiya%2F%23comment_720=en=UTF-8=ru=en
>
> On Sep 23, 12:03 am, Andrew Godwin <[EMAIL PROTECTED]> wrote:
>> So, hello everyone. I figure this list is the best place to ask this,
>> but please feel free to deride me if not...
>>
>> After all the talk of multiple databases, and non-relational  
>> databases
>> (bigtable, couchdb, etc.) that went on at DjangoCon and  
>> afterward,I've
>> been thinking about denormali[s/z]ation and how to make it easier in
>> Django; previously, I had thought denormalisation was something you  
>> did
>> to ruin your database, but I can now see it can be additive as well.
>>
>> In this vein, last weekend I decided to see how easy it would be to
>> write a magical DenormalisationField which handled all the  
>> synchronising
>> of data from the relevant tables given a normal ForeignKey  
>> relationship;
>> the results are athttp://www.aeracode.org/2008/9/14/denormalisation-follies/
>>
>> Basically, yes, it is possible, and a lot of people (eight! for my  
>> blog,
>> that's a lot) seem to like the idea; it certainly saves having to  
>> write
>> that replicate-on-save logic that otherwise goes with this kind of
>> schema design.
>>
>> Given the fact that it seems like a reasonably common thing to do in
>> large-scale applications, I'd like to know what people think about
>> adding this kind of thing into core (while I have a standalone
>> implementation, that needs some hacks I'd like to see gone). My main
>> issues with this idea are:
>>
>> 1) It relies on having a ForeignKey pointing to the other model, so  
>> this
>> might not work for non-relational databases (well, unless ForeignKeys
>> are still allowed in that case, but they just perform expensive  
>> SELECTs
>> as well as possibly printing abusive messages to stderr).
>>
>> 2) I'm not sure how common a use case this is; it sounds like it  
>> might
>> be verging into the long tail a bit.
>>
>> 3) People also want other things denormalised - count()s, aggregate
>> queries, etc. Of course, writing fields for these too is similar, and
>> I'd be happy to do it, but it might mean that even if the case for
>> denormalisation is good this is only a piece of the puzzle.
>>
>> So, I'd love people's opinions about where I should take this - to a
>> dark corner and hide it, to a separate app like django-extensions  
>> (which
>> involves keeping the horrid hacks in there), into a separate app but
>> with patches to core to make it less hackish (i.e. more signals),  
>> or add
>> it as my 1.1 pony with the proviso that I'm happy to write all the  
>> code.
>>
>> Andrew
> >


--~--~-~--~~~---~--~~
You received this message

Re: Denormalisation, magic, and is it really that useful?

2008-09-26 Thread Alex Koshelev


Hi, guys!

For a long time I have thoughts to make composition/denormalisation a
little bit easier and reusable. But I have no time to implement my
ideas.
Inspired by Anrew's blog post and its thread I recently wrote some
code. And I think that it really may be useful and cover most of
denormalisation use cases in Django.

And this is an example of Anrew's code but with my CompositionField:

class Picture(models.model):
user = models.ForeignKey(User)
user_username = CompositionField(
  native=models.CharField(max_length=150),
  trigger=dict(
  sender_model=User,
  field_holder_getter=lambda user:
user.picture_set.all(),
  do=lambda picture, user, signal:
user.username
)
 )
image = models.ImageField(upload_to="foo")
title = models.CharField(max_length=100)

Its more verbose but very generic and can be customized. And free
bonus - auto-generated `update_user_username` method.

The code available here [1]. Its very generic but allows to write more
high-level subclasses with some introspection.

Today I've written the blog post about it and my future suggestions.
It is in Russian[2], here is a google-translated version [3] (but with
broken code blocks) that I think can help to understand some concepts.
And it holds more usage examples.

I have many new ideas of how my solution can be improved and I what to
hear you opinions about it.

Thanks.

[1]: http://svn.turbion.org/turbion/trunk/turbion/utils/composition.py
[2]: http://webnewage.org/post/2008/9/26/krasivaya-kompozitsiya/
[3]:
http://translate.google.com/translate?u=http%3A%2F%2Fwebnewage.org%2Fpost%2F2008%2F9%2F26%2Fkrasivaya-kompozitsiya%2F%23comment_720=en=UTF-8=ru=en

On Sep 23, 12:03 am, Andrew Godwin <[EMAIL PROTECTED]> wrote:
> So, hello everyone. I figure this list is the best place to ask this,
> but please feel free to deride me if not...
>
> After all the talk of multiple databases, and non-relational databases
> (bigtable, couchdb, etc.) that went on at DjangoCon and afterward,I've
> been thinking about denormali[s/z]ation and how to make it easier in
> Django; previously, I had thought denormalisation was something you did
> to ruin your database, but I can now see it can be additive as well.
>
> In this vein, last weekend I decided to see how easy it would be to
> write a magical DenormalisationField which handled all the synchronising
> of data from the relevant tables given a normal ForeignKey relationship;
> the results are athttp://www.aeracode.org/2008/9/14/denormalisation-follies/
>
> Basically, yes, it is possible, and a lot of people (eight! for my blog,
> that's a lot) seem to like the idea; it certainly saves having to write
> that replicate-on-save logic that otherwise goes with this kind of
> schema design.
>
> Given the fact that it seems like a reasonably common thing to do in
> large-scale applications, I'd like to know what people think about
> adding this kind of thing into core (while I have a standalone
> implementation, that needs some hacks I'd like to see gone). My main
> issues with this idea are:
>
> 1) It relies on having a ForeignKey pointing to the other model, so this
> might not work for non-relational databases (well, unless ForeignKeys
> are still allowed in that case, but they just perform expensive SELECTs
> as well as possibly printing abusive messages to stderr).
>
> 2) I'm not sure how common a use case this is; it sounds like it might
> be verging into the long tail a bit.
>
> 3) People also want other things denormalised - count()s, aggregate
> queries, etc. Of course, writing fields for these too is similar, and
> I'd be happy to do it, but it might mean that even if the case for
> denormalisation is good this is only a piece of the puzzle.
>
> So, I'd love people's opinions about where I should take this - to a
> dark corner and hide it, to a separate app like django-extensions (which
> involves keeping the horrid hacks in there), into a separate app but
> with patches to core to make it less hackish (i.e. more signals), or add
> it as my 1.1 pony with the proviso that I'm happy to write all the code.
>
> Andrew
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-25 Thread David Cramer

CopiedField sounds a bit off, but otherwise I like the proposed additions.

On Thu, Sep 25, 2008 at 8:08 AM, Andrew Godwin <[EMAIL PROTECTED]>wrote:

>
> David Cramer wrote:
> > I would say ignore triggers on the DB level, until they've been
> > written in the framework.
> >
>
> Yes, this was essentially my point earlier; triggers would be nice to
> have from a consistency point of view, but it will be easier and quicker
> to reimplement them in Python, not least because it allows more complex
> functionality to be put straight in if we want it in the future. Then,
> we can always extend it so that some types of fields are implemented
> directly as triggers if you want once the API is settled.
>
> With all this in mind, my API proposal would be something like this:
>
> A CopiedField, called like so:   user_username =
> models.CopiedField("user", "username")   [here, "user" is the name of
> the ForeignKey to link through]
>
> An AggregateField:  post_count =
> models.AggregateField(Post.objects.filter(visible=True), "count")
>
> The CopiedField essentially works how my current DenormField does, but
> using a foreignkey to avoid you having to specify both it _and_ the
> model if you have more than one to the same model (a reasonable amount
> of the DenormField code is just for finding which foreignkey to link
> through, anyway)
>
> The AggregateField takes a simple queryset (from which it can extract
> the model to hook a signal onto), and whenever something in the queryset
> is saved (so, either an extra select on each save to see what items are
> in the queryset, or just updating the aggregate every time - possibly
> with either selectable, since the cost might be higher on either), and
> it also takes the function to run on the set - either a builtin name
> like "count" here, or an actual function which takes a queryset and
> returns a result (in this situation the type of the underlying column
> will also have to be specified).
>
> This should probably cover most cases, and has the advantage of people
> being able to specify their own in-python aggregate functions (such as
> working out the standard deviation of the users' ages) if they tell us
> what kind of field to sling the result in.
>
> I'm happy to change the field names, too, CopiedField especially is a
> little clunky.
>
> Andrew
>
> >
>


-- 
David Cramer
Director of Technology
iBegin
http://www.ibegin.com/

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-25 Thread Andrew Godwin

David Cramer wrote:
> I would say ignore triggers on the DB level, until they've been
> written in the framework.
>   

Yes, this was essentially my point earlier; triggers would be nice to
have from a consistency point of view, but it will be easier and quicker
to reimplement them in Python, not least because it allows more complex
functionality to be put straight in if we want it in the future. Then,
we can always extend it so that some types of fields are implemented
directly as triggers if you want once the API is settled.

With all this in mind, my API proposal would be something like this:

A CopiedField, called like so:   user_username =
models.CopiedField("user", "username")   [here, "user" is the name of
the ForeignKey to link through]

An AggregateField:  post_count =
models.AggregateField(Post.objects.filter(visible=True), "count")

The CopiedField essentially works how my current DenormField does, but
using a foreignkey to avoid you having to specify both it _and_ the
model if you have more than one to the same model (a reasonable amount
of the DenormField code is just for finding which foreignkey to link
through, anyway)

The AggregateField takes a simple queryset (from which it can extract
the model to hook a signal onto), and whenever something in the queryset
is saved (so, either an extra select on each save to see what items are
in the queryset, or just updating the aggregate every time - possibly
with either selectable, since the cost might be higher on either), and
it also takes the function to run on the set - either a builtin name
like "count" here, or an actual function which takes a queryset and
returns a result (in this situation the type of the underlying column
will also have to be specified).

This should probably cover most cases, and has the advantage of people
being able to specify their own in-python aggregate functions (such as
working out the standard deviation of the users' ages) if they tell us
what kind of field to sling the result in.

I'm happy to change the field names, too, CopiedField especially is a
little clunky.

Andrew

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-24 Thread David Cramer


Denormalized for me personally happens a LOT when it's not just a
simple foreign key. Maybe I want to store a many to many field as
name,name,name in a column (tags). We're already inside of Django so
it should already be available (it's not going to be much slower than
it is executing the subquery in sql to get the denormalized output).

On Sep 23, 4:19 pm, "Justin Fagnani" <[EMAIL PROTECTED]> wrote:
> On Tue, Sep 23, 2008 at 12:52 AM, David Cramer <[EMAIL PROTECTED]> wrote:>
> > For me, personally, it would be great if this could accept callables
> > as well. So you could store the username, like so, or you could store
> > a choices field like:
>
> >    field = models.IntegerField(choices=CHOICES)
> >    denorm = models.DenormField('self', 'get_field_display') # which
> > would rock if it was .field.display ;)
>
> I think denormalizing with callables is a very different thing than
> denormalizing with expressions that can be evaluated by the database.
> Not that they both aren't worth supporting, but db-level expressions
> are going to be far easier and more efficient to validate and update
> in bulk - no looping in Python, just executing SQL. In this case, I
> think your example would be better suited as an FK, for denorm
> purposes.
>
> Callables would be useful for something more complicated like
> abstracting auto_now to not be limited to dates, that is allowing a
> field value to be set by a callable on save, not just create, for any
> field type.
>
> On Tue, Sep 23, 2008 at 2:42 AM, Andrew Godwin <[EMAIL PROTECTED]> wrote:
> > Still, with an
> > AggregateField(Sandwiches.filter(filling="cheese").count()) it's still
> > possible to work out that you want to listen on the Sandwiches model,
> > and you could then fall back to re-running the count on every Sandwich
> > save, even if it ends up not having a cheese filling.
>
> I'm not sure I like the idea of accepting arbitrary QuerySets. It
> could just be my point-of-view, but I see automatic denormalization
> and aggregation as intimately tied, where a denormalized field is just
> a declarative aggregate expression that's optionally cached in a
> column. I think this makes it easy to understand and document, since
> aggregation queries and calculation fields would support the same
> features, and it also allows the implementation to share a lot with
> aggregates. It's also better in terms of storing and updating a
> calculation: you can calculate on reads or writes for N objects in one
> query without N subqueries, though it may involve a lot of joins.
>
> > So, I think the best approach would be one to replicate fields (like my
> > current DenormField; perhaps call it CopyField or something) and one to
> > cache aggregates (an AggregateField, like above).
>
> I'd also be hesitant to have two separate fields for these cases,
> since copying a related field value is just a simple SQL expression. I
> think the same calculation field could be used for both, by assuming a
> string is a field name:
>
> name = CalculationField('user.name')
>
> or by using F expressions:
>
> name = CalculationField(F('user.name'))
>
> Another approach to calculations, that doesn't necessarily involve
> denormalization, would be to use views. Russell talked to me a bit
> about this at djangocon, and I think the idea was that if you solve
> basing models on views (Isn't it actually possible now? Maybe we need
> read-only fields), and then have view creation support in the ORM,
> then calculation fields can be implemented with views. I see that
> un-stored calculations are re-implementing views, but I don't know
> enough about views to know whether they have some performance
> advantages over embedding the calculation in a query.
>
> -Justin
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-24 Thread David Cramer


I just wanted to touch on this point:

models.IntegerField(default=0) does not translate to INT(11) default
0; in SQL :)

Django doesn't nescesarily try to interact with the database properly
when it should, because it requires too much thinking (and more code
that hasn't been written).

I would say ignore triggers on the DB level, until they've been
written in the framework.

On Sep 23, 7:44 am, Steve Holden <[EMAIL PROTECTED]> wrote:
> This appears to be a proposal to re-implement triggers inside Django.
>
> I can see there are benefits if the underlying DB platform won't support
> triggers, but wouldn't triggers be the preferred solution when they're
> available? That way there is no chance that changes can be made outside
> the scope of the denormalization, and hence no need to recompute the
> denormalized values.
>
> regards
>  Steve
>
> Andrew Godwin wrote:
> > David Cramer wrote:
>
> >> If you're not doing denormalization in your database, most likely
> >> you're doing something wrong. I really like the approach that is
> >> offered here.
>
> >> For me, personally, it would be great if this could accept callables
> >> as well. So you could store the username, like so, or you could store
> >> a choices field like:
>
> >>     field = models.IntegerField(choices=CHOICES)
> >>     denorm = models.DenormField('self', 'get_field_display') # which
> >> would rock if it was .field.display ;)
>
> >> You could also make it somehow accept querysets or something similar
> >> for things like count(). I see a lot of possibilities and am a bit
> >> disappointed I didn't come up with something this easy for my use-
> >> cases.
>
> > The key is making sure you can listen for changes on whatever's at the
> > other end of your denormalisation. With my current snippet, it listens
> > for a save on the model the foreignkey points to, then checks for the
> > right ID; if we start accepting random querysets, then there has to be a
> > way to resolve that back to conditions the signal listener can understand.
>
> > Still, with an
> > AggregateField(Sandwiches.filter(filling="cheese").count()) it's still
> > possible to work out that you want to listen on the Sandwiches model,
> > and you could then fall back to re-running the count on every Sandwich
> > save, even if it ends up not having a cheese filling.
>
> > So, I think the best approach would be one to replicate fields (like my
> > current DenormField; perhaps call it CopyField or something) and one to
> > cache aggregates (an AggregateField, like above).
>
> > Simon Willison wrote:
>
> >> Just so it's on the record, I'd like any denormalisation tools in
> >> Django to include a mechanism for re-syncronizing them should
> >> something go awry (like direct updates being applied to the database
> >> without keeping the denormalised fields in sync). This mechanism could
> >> then be exposed as a ./manage.py command which could be called
> >> periodically to verify and fix any incorrect data.
>
> > Yes, this I very much agree with. The reason you always layer this stuff
> > on top of a pre-normalised database is because you can then rebuild the
> > data after playing with it externally.
>
> > Doing so shouldn't be too much of a problem; have a management command
> > that loads the models, and then just executes the update method on each
> > of the denormalisationalish fields.
>
> > Justin's idea of lazy updating is interesting, and probably quite doable
> > (as well as what most people will want by default on aggregate queries).
>
> > I'm also hoping any kind of aggregate denormalisation will work with any
> > future extended aggregate support, but if the field just takes a normal
> > QuerySet, that might Just Work™.
>
> > Andrew
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-23 Thread Justin Fagnani

On Tue, Sep 23, 2008 at 12:52 AM, David Cramer <[EMAIL PROTECTED]> wrote:>
> For me, personally, it would be great if this could accept callables
> as well. So you could store the username, like so, or you could store
> a choices field like:
>
>field = models.IntegerField(choices=CHOICES)
>denorm = models.DenormField('self', 'get_field_display') # which
> would rock if it was .field.display ;)

I think denormalizing with callables is a very different thing than
denormalizing with expressions that can be evaluated by the database.
Not that they both aren't worth supporting, but db-level expressions
are going to be far easier and more efficient to validate and update
in bulk - no looping in Python, just executing SQL. In this case, I
think your example would be better suited as an FK, for denorm
purposes.

Callables would be useful for something more complicated like
abstracting auto_now to not be limited to dates, that is allowing a
field value to be set by a callable on save, not just create, for any
field type.

On Tue, Sep 23, 2008 at 2:42 AM, Andrew Godwin <[EMAIL PROTECTED]> wrote:
> Still, with an
> AggregateField(Sandwiches.filter(filling="cheese").count()) it's still
> possible to work out that you want to listen on the Sandwiches model,
> and you could then fall back to re-running the count on every Sandwich
> save, even if it ends up not having a cheese filling.

I'm not sure I like the idea of accepting arbitrary QuerySets. It
could just be my point-of-view, but I see automatic denormalization
and aggregation as intimately tied, where a denormalized field is just
a declarative aggregate expression that's optionally cached in a
column. I think this makes it easy to understand and document, since
aggregation queries and calculation fields would support the same
features, and it also allows the implementation to share a lot with
aggregates. It's also better in terms of storing and updating a
calculation: you can calculate on reads or writes for N objects in one
query without N subqueries, though it may involve a lot of joins.

> So, I think the best approach would be one to replicate fields (like my
> current DenormField; perhaps call it CopyField or something) and one to
> cache aggregates (an AggregateField, like above).

I'd also be hesitant to have two separate fields for these cases,
since copying a related field value is just a simple SQL expression. I
think the same calculation field could be used for both, by assuming a
string is a field name:

name = CalculationField('user.name')

or by using F expressions:

name = CalculationField(F('user.name'))

Another approach to calculations, that doesn't necessarily involve
denormalization, would be to use views. Russell talked to me a bit
about this at djangocon, and I think the idea was that if you solve
basing models on views (Isn't it actually possible now? Maybe we need
read-only fields), and then have view creation support in the ORM,
then calculation fields can be implemented with views. I see that
un-stored calculations are re-implementing views, but I don't know
enough about views to know whether they have some performance
advantages over embedding the calculation in a query.

-Justin

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-23 Thread David Durham, Jr.


On Tue, Sep 23, 2008 at 7:44 AM, Steve Holden <[EMAIL PROTECTED]> wrote:
>
> This appears to be a proposal to re-implement triggers inside Django.
>
> I can see there are benefits if the underlying DB platform won't support
> triggers, but wouldn't triggers be the preferred solution when they're
> available? That way there is no chance that changes can be made outside
> the scope of the denormalization, and hence no need to recompute the
> denormalized values.

It would be nice to have an option to make atomic changes to the
normalized and denormalized tables.

-Dave

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-23 Thread Simon Willison


On Sep 23, 2:00 pm, "Marty Alchin" <[EMAIL PROTECTED]> wrote:
> Without some Python-based approach, all I could see is maybe adding a
> cross-platform "create trigger" API (ugh) to Django, which an
> application could then use to set up its triggers during syncdb.
> Otherwise, something like that forum app would have to implement a
> trigger for all available backends or just ship with instructions on
> how to set it up yourself.

Triggers also aren't supported in Drizzle ( https://launchpad.net/drizzle
), which Django will probably want to target at some point.

Cheers,

Simon
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-23 Thread Marty Alchin

On Tue, Sep 23, 2008 at 8:44 AM, Steve Holden <[EMAIL PROTECTED]> wrote:
> This appears to be a proposal to re-implement triggers inside Django.

I suppose it is. But then, perhaps triggers were a re-implementation
of application-based denormalization. ... The chicken? The egg? The
world may never know. :)

> I can see there are benefits if the underlying DB platform won't support
> triggers, but wouldn't triggers be the preferred solution when they're
> available? That way there is no chance that changes can be made outside
> the scope of the denormalization, and hence no need to recompute the
> denormalized values.

Perhaps the biggest downside is that distributed applications can't
make use of it, since triggers aren't cross-platform. Sure, you could
argue that denormalization is best suited for individual projects,
rather than distributed apps, but I could see uses for it. Imagine a
full-blown forum app that wants to denormalize its post counts.

Without some Python-based approach, all I could see is maybe adding a
cross-platform "create trigger" API (ugh) to Django, which an
application could then use to set up its triggers during syncdb.
Otherwise, something like that forum app would have to implement a
trigger for all available backends or just ship with instructions on
how to set it up yourself.

That said, you're right, I would think it's usually safest to put that
stuff directly in the db. That way, if you have other applications
using other languages or perhaps just other ORMs, everything's all
intact. I'm pretty sure I've seen arguments on both sides of the
issue, though, so I'm not sure there's any objective answer to it.

-Gul

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-23 Thread Andrew Godwin

Yes, Steve, it's true that triggers do have much of the same 
functionality as this kind of proposal, but as you say, part of the 
proposal is to make these hopefully work with future non-SQL databases 
(although, admittedly, that's only a small piece of the puzzle).

To be honest, my main drive is to have a nice, easy denormalisation API 
for Django; whether this means implementing triggers in all the database 
backends and using them behind this kind of API remains to be seen. The 
problems I can see with this at the moment are that:

 1) If you change a DenormField, there's less writing 
migrations/fiddling with the database than if we managed it all in 
Django code (where you could just resync)
 2) Triggers are a whole part of SQL Django just hasn't touched yet, and 
implementing support for them will take a reasonable amount of work in 
both debugging and testing for each backend.

I'd really like to go ahead and implement this stuff 'vanilla' at first, 
relying on the tried-and-tested field and signal code, and then if it 
gets popular, add in the option to run denormalisations using triggers 
rather than inside Django. That way, we eventually end up with the best 
of both worlds, and get people using more efficient schemas first, which 
they can then easily bake into triggers when they come along, and if 
their DB supports them.

Andrew

Steve Holden wrote:
> This appears to be a proposal to re-implement triggers inside Django.
>
> I can see there are benefits if the underlying DB platform won't support
> triggers, but wouldn't triggers be the preferred solution when they're
> available? That way there is no chance that changes can be made outside
> the scope of the denormalization, and hence no need to recompute the
> denormalized values.
>
> regards
>  Steve
>   

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-23 Thread Steve Holden


This appears to be a proposal to re-implement triggers inside Django.

I can see there are benefits if the underlying DB platform won't support
triggers, but wouldn't triggers be the preferred solution when they're
available? That way there is no chance that changes can be made outside
the scope of the denormalization, and hence no need to recompute the
denormalized values.

regards
 Steve

Andrew Godwin wrote:
> David Cramer wrote:
>   
>> If you're not doing denormalization in your database, most likely
>> you're doing something wrong. I really like the approach that is
>> offered here.
>>
>> For me, personally, it would be great if this could accept callables
>> as well. So you could store the username, like so, or you could store
>> a choices field like:
>>
>> field = models.IntegerField(choices=CHOICES)
>> denorm = models.DenormField('self', 'get_field_display') # which
>> would rock if it was .field.display ;)
>>
>> You could also make it somehow accept querysets or something similar
>> for things like count(). I see a lot of possibilities and am a bit
>> disappointed I didn't come up with something this easy for my use-
>> cases.
>>   
>> 
>
> The key is making sure you can listen for changes on whatever's at the 
> other end of your denormalisation. With my current snippet, it listens 
> for a save on the model the foreignkey points to, then checks for the 
> right ID; if we start accepting random querysets, then there has to be a 
> way to resolve that back to conditions the signal listener can understand.
>
> Still, with an 
> AggregateField(Sandwiches.filter(filling="cheese").count()) it's still 
> possible to work out that you want to listen on the Sandwiches model, 
> and you could then fall back to re-running the count on every Sandwich 
> save, even if it ends up not having a cheese filling.
>
> So, I think the best approach would be one to replicate fields (like my 
> current DenormField; perhaps call it CopyField or something) and one to 
> cache aggregates (an AggregateField, like above).
>
> Simon Willison wrote:
>   
>> Just so it's on the record, I'd like any denormalisation tools in
>> Django to include a mechanism for re-syncronizing them should
>> something go awry (like direct updates being applied to the database
>> without keeping the denormalised fields in sync). This mechanism could
>> then be exposed as a ./manage.py command which could be called
>> periodically to verify and fix any incorrect data.
>> 
> Yes, this I very much agree with. The reason you always layer this stuff 
> on top of a pre-normalised database is because you can then rebuild the 
> data after playing with it externally.
>
> Doing so shouldn't be too much of a problem; have a management command 
> that loads the models, and then just executes the update method on each 
> of the denormalisationalish fields.
>
> Justin's idea of lazy updating is interesting, and probably quite doable 
> (as well as what most people will want by default on aggregate queries).
>
> I'm also hoping any kind of aggregate denormalisation will work with any 
> future extended aggregate support, but if the field just takes a normal 
> QuerySet, that might Just Work™.
>
> Andrew
>
>
> >
>
>   



--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-23 Thread Andrew Godwin

David Cramer wrote:
> If you're not doing denormalization in your database, most likely
> you're doing something wrong. I really like the approach that is
> offered here.
>
> For me, personally, it would be great if this could accept callables
> as well. So you could store the username, like so, or you could store
> a choices field like:
>
> field = models.IntegerField(choices=CHOICES)
> denorm = models.DenormField('self', 'get_field_display') # which
> would rock if it was .field.display ;)
>
> You could also make it somehow accept querysets or something similar
> for things like count(). I see a lot of possibilities and am a bit
> disappointed I didn't come up with something this easy for my use-
> cases.
>   

The key is making sure you can listen for changes on whatever's at the 
other end of your denormalisation. With my current snippet, it listens 
for a save on the model the foreignkey points to, then checks for the 
right ID; if we start accepting random querysets, then there has to be a 
way to resolve that back to conditions the signal listener can understand.

Still, with an 
AggregateField(Sandwiches.filter(filling="cheese").count()) it's still 
possible to work out that you want to listen on the Sandwiches model, 
and you could then fall back to re-running the count on every Sandwich 
save, even if it ends up not having a cheese filling.

So, I think the best approach would be one to replicate fields (like my 
current DenormField; perhaps call it CopyField or something) and one to 
cache aggregates (an AggregateField, like above).

Simon Willison wrote:
> Just so it's on the record, I'd like any denormalisation tools in
> Django to include a mechanism for re-syncronizing them should
> something go awry (like direct updates being applied to the database
> without keeping the denormalised fields in sync). This mechanism could
> then be exposed as a ./manage.py command which could be called
> periodically to verify and fix any incorrect data.
Yes, this I very much agree with. The reason you always layer this stuff 
on top of a pre-normalised database is because you can then rebuild the 
data after playing with it externally.

Doing so shouldn't be too much of a problem; have a management command 
that loads the models, and then just executes the update method on each 
of the denormalisationalish fields.

Justin's idea of lazy updating is interesting, and probably quite doable 
(as well as what most people will want by default on aggregate queries).

I'm also hoping any kind of aggregate denormalisation will work with any 
future extended aggregate support, but if the field just takes a normal 
QuerySet, that might Just Work™.

Andrew

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-23 Thread David Cramer


If you're not doing denormalization in your database, most likely
you're doing something wrong. I really like the approach that is
offered here.

For me, personally, it would be great if this could accept callables
as well. So you could store the username, like so, or you could store
a choices field like:

field = models.IntegerField(choices=CHOICES)
denorm = models.DenormField('self', 'get_field_display') # which
would rock if it was .field.display ;)

You could also make it somehow accept querysets or something similar
for things like count(). I see a lot of possibilities and am a bit
disappointed I didn't come up with something this easy for my use-
cases.

+1 from me!

On Sep 22, 7:00 pm, Simon Willison <[EMAIL PROTECTED]> wrote:
> On Sep 23, 12:21 am, "Justin Fagnani" <[EMAIL PROTECTED]>
> wrote:
>
> > In my experience at least, denormalization occurs
> > a lot and leaves a lot of room for mistakes, so it's something a
> > framework should handle if possible.
>
> Just so it's on the record, I'd like any denormalisation tools in
> Django to include a mechanism for re-syncronizing them should
> something go awry (like direct updates being applied to the database
> without keeping the denormalised fields in sync). This mechanism could
> then be exposed as a ./manage.py command which could be called
> periodically to verify and fix any incorrect data.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-22 Thread Simon Willison


On Sep 23, 12:21 am, "Justin Fagnani" <[EMAIL PROTECTED]>
wrote:
> In my experience at least, denormalization occurs
> a lot and leaves a lot of room for mistakes, so it's something a
> framework should handle if possible.

Just so it's on the record, I'd like any denormalisation tools in
Django to include a mechanism for re-syncronizing them should
something go awry (like direct updates being applied to the database
without keeping the denormalised fields in sync). This mechanism could
then be exposed as a ./manage.py command which could be called
periodically to verify and fix any incorrect data.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-22 Thread Justin Fagnani


I think any discussion of denormalization needs to include aggregation
as well, since most uses will involve more than simply looking up one
value form a related model. I also think this is something definitely
to include in core. In my experience at least, denormalization occurs
a lot and leaves a lot of room for mistakes, so it's something a
framework should handle if possible.

Ideally I'd like to see either a calculation field that can take
aggregation expressions and options about how to store and
synchronize.

I imaging something like this syntax:

class Order(models.Model):
# updates the order total if the line items change
subtotal = models.CalculationField(Sum('items.cost'), store=True, sync=True)

# doesn't update the order's customer name if the customer's name changes
customer = models.ForeignKey(Customer)
customer_name = models.CalculationField(F('customer.name'),
store=True, sync=False)

# don't store a rarely used value. calculate on select
weight = CalculationField(Sum('items.weight'), store=False)


This approach would require some changes to the ORM and aggregation,
but it'd be worth it since it'd make denormalization easy, flexible
and less error-prone. Fields would need to be able to contribute more
to queries and table creation, and we'd some simple need type
inference for expressions/aggregation. If calculation support is added
to all fields, or calculations have to have a declared type, then type
inference wouldn't be necessary, but type inference for aggregation
isn't hard.

-Justin

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-22 Thread Winsley von Spee


Hi,

this would be very useful for a lot of use cases. View counts that
originally came from a foo viewed bar table with timestamps for every
viewer, or storing some sort of sum calculation from a list of
related objects. Basically all that stuff you would display in an
"overview table" of some sort for which you would normally use
aggregation wich gets very slow very fast on large data sets.

I have written a generic denormalizer to store whole data sets
in denormalized tables to speed up searching.

It basically consists of a Metaclass that constructs a Model out of a
values queryset and a base class that adds method to generate sql to
keep the table up to date. This way most of the time you only have
to write the values query set and for special cases you can still add
"custom" Fields if you have to.

The tables get updated periodically, but it would be quite possible to
do this with signals instantly.

A previous version even generated triggers to keep the tables updated,
but this was dropped afterwards due to other reasons.

Queryset refactor did help a lot to make this sort of stuff easier.
Only quirk right now is that you have to use extra and "force"
queryset.query to use aggregation wich you'll need very often when
denormalizing.

I think a django.contrib.denormalization app would be a very nice
thingy.

+1 from me ;).

Regards


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

2008-09-22 Thread Marty Alchin

On Mon, Sep 22, 2008 at 4:03 PM, Andrew Godwin <[EMAIL PROTECTED]> wrote:
> So, I'd love people's opinions about where I should take this - to a
> dark corner and hide it, to a separate app like django-extensions (which
> involves keeping the horrid hacks in there), into a separate app but
> with patches to core to make it less hackish (i.e. more signals), or add
> it as my 1.1 pony with the proviso that I'm happy to write all the code.

Personally, I think it's best suited for a django-d13n (that's d13n
for denormali(s|z)ation) app that stands on its own. That way, you can
provide a host of other options as you see fit, have other people get
onboard and provide whatever they use, and so on.

I think there's definite merit to what you're doing, but I just don't
see it ever being in core, unless it requires some freakish internal
changes. If you do need changes to core, or you can at least frame
them in a way that would provide wider benefits, you'll definitely get
a fruitful discussion on it, at least.

-Gul

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Denormalisation, magic, and is it really that useful?

2008-09-22 Thread Andrew Godwin


So, hello everyone. I figure this list is the best place to ask this,
but please feel free to deride me if not...

After all the talk of multiple databases, and non-relational databases
(bigtable, couchdb, etc.) that went on at DjangoCon and afterward,I've
been thinking about denormali[s/z]ation and how to make it easier in
Django; previously, I had thought denormalisation was something you did
to ruin your database, but I can now see it can be additive as well.

In this vein, last weekend I decided to see how easy it would be to
write a magical DenormalisationField which handled all the synchronising
of data from the relevant tables given a normal ForeignKey relationship;
the results are at
http://www.aeracode.org/2008/9/14/denormalisation-follies/

Basically, yes, it is possible, and a lot of people (eight! for my blog,
that's a lot) seem to like the idea; it certainly saves having to write
that replicate-on-save logic that otherwise goes with this kind of
schema design.

Given the fact that it seems like a reasonably common thing to do in
large-scale applications, I'd like to know what people think about
adding this kind of thing into core (while I have a standalone
implementation, that needs some hacks I'd like to see gone). My main
issues with this idea are:

1) It relies on having a ForeignKey pointing to the other model, so this
might not work for non-relational databases (well, unless ForeignKeys
are still allowed in that case, but they just perform expensive SELECTs
as well as possibly printing abusive messages to stderr).

2) I'm not sure how common a use case this is; it sounds like it might
be verging into the long tail a bit.

3) People also want other things denormalised - count()s, aggregate
queries, etc. Of course, writing fields for these too is similar, and
I'd be happy to do it, but it might mean that even if the case for
denormalisation is good this is only a piece of the puzzle.

So, I'd love people's opinions about where I should take this - to a
dark corner and hide it, to a separate app like django-extensions (which
involves keeping the horrid hacks in there), into a separate app but
with patches to core to make it less hackish (i.e. more signals), or add
it as my 1.1 pony with the proviso that I'm happy to write all the code.

Andrew

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Re: Denormalisation, magic, and is it really that useful?

Denormalisation, magic, and is it really that useful?

20 matches

Site Navigation

Mail list logo

Footer information