Re: Denormalisation, magic, and is it really that useful?
On Tue, Sep 23, 2008 at 12:52 AM, David Cramer <[EMAIL PROTECTED]> wrote:

> For me, personally, it would be great if this could accept callables
> as well. So you could store the username, like so, or you could store
> a choices field like:
>
>     field = models.IntegerField(choices=CHOICES)
>     denorm = models.DenormField('self', 'get_field_display')
>     # which would rock if it was .field.display ;)

I think denormalizing with callables is a very different thing than denormalizing with expressions that can be evaluated by the database. Not that they both aren't worth supporting, but db-level expressions are going to be far easier and more efficient to validate and update in bulk: no looping in Python, just executing SQL. In this case, I think your example would be better suited as an FK for denorm purposes. Callables would be useful for something more complicated, like abstracting auto_now so it isn't limited to dates, that is, allowing a field value to be set by a callable on save, not just on create, for any field type.

On Tue, Sep 23, 2008 at 2:42 AM, Andrew Godwin <[EMAIL PROTECTED]> wrote:

> Still, with an
> AggregateField(Sandwiches.filter(filling="cheese").count()) it's still
> possible to work out that you want to listen on the Sandwiches model,
> and you could then fall back to re-running the count on every Sandwich
> save, even if it ends up not having a cheese filling.

I'm not sure I like the idea of accepting arbitrary QuerySets. It could just be my point of view, but I see automatic denormalization and aggregation as intimately tied, where a denormalized field is just a declarative aggregate expression that's optionally cached in a column. I think this makes it easy to understand and document, since aggregation queries and calculation fields would support the same features, and it also allows the implementation to share a lot with aggregates.
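The bulk-update point is easy to demonstrate. Here is a minimal sketch using sqlite3 with a made-up two-table schema: a single UPDATE statement refreshes the denormalized column for every row, with no per-object loop in Python.

```python
import sqlite3

# Hypothetical schema: orders carry a denormalized copy of the customer name.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "order" (id INTEGER PRIMARY KEY, customer_id INTEGER,
                          customer_name TEXT);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO "order" VALUES (10, 1, NULL), (11, 2, NULL);
""")

# One statement refreshes every denormalized value in bulk.
db.execute("""
    UPDATE "order"
    SET customer_name = (SELECT name FROM customer
                         WHERE customer.id = "order".customer_id)
""")

rows = db.execute('SELECT id, customer_name FROM "order" ORDER BY id').fetchall()
print(rows)  # [(10, 'Ada'), (11, 'Grace')]
```

The equivalent callable-based approach would have to load each order into Python, call the callable, and save each object back, one query per row.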
It's also better in terms of storing and updating a calculation: you can calculate on reads or writes for N objects in one query without N subqueries, though it may involve a lot of joins.

> So, I think the best approach would be one to replicate fields (like my
> current DenormField; perhaps call it CopyField or something) and one to
> cache aggregates (an AggregateField, like above).

I'd also be hesitant to have two separate fields for these cases, since copying a related field value is just a simple SQL expression. I think the same calculation field could be used for both, by assuming a string is a field name:

    name = CalculationField('user.name')

or by using F expressions:

    name = CalculationField(F('user.name'))

Another approach to calculations, one that doesn't necessarily involve denormalization, would be to use views. Russell talked to me a bit about this at DjangoCon, and I think the idea was that if you solve basing models on views (isn't that actually possible now? Maybe we need read-only fields) and then add view-creation support to the ORM, then calculation fields can be implemented with views. I see that un-stored calculations are re-implementing views, but I don't know enough about views to know whether they have some performance advantages over embedding the calculation in a query.

-Justin

--~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~--~~~~--~~--~--~---
Re: Proposal: label tag attributes
On Tue, Sep 16, 2008 at 2:41 AM, oggy <[EMAIL PROTECTED]> wrote:

> On Sep 15, 10:49 pm, "Justin Fagnani" <[EMAIL PROTECTED]>
> I can clearly see the appeal of the idea. Django can stay Javascript-
> agnostic, while the community could develop a PrototypeForms,
> DojoForms, etc. and it would be a single setting for the user.

That would be the idea. You of course noticed some wrinkles, but I think it'd be awesome if you could install a djQuery or djojo app and get a bunch of form widgets and view base classes that effectively give you AJAX support out of the box.

> One problem I see is that for AJAX, you need to add view support. If I
> were to switch from my regular form to an AJAX form with just a Meta
> setting, how is my once-a-plain-CharField-but-now-auto-complete Widget
> supposed to find its supporting view?

I don't think that's too much of an issue. A CharField is never going to magically convert to an AutoCompleteField without specifying the data source to work from, but an AutoCompleteField could provide a common interface that would allow the dev to switch implementations. Maybe class-based views could then be used to easily provide the correct output format. A ChoiceField could be AJAXified without additional configuration.

>> 2) I've also thought that it would be really nice to have some way
>> of dynamically transforming template output from within the template.
>> Basically doing jQuery-type manipulations in templates, possibly with
>> XPath or jQuery-style selectors. You could add a required class like
>> this:
>>
>> {% transform %}
>> {% addclass "tr:has(*.required) label" "required" %}
>> {{ form }}
>> {% endtransform %}
>
> That seems wicked cool, even if I don't understand it fully :D What
> does "has(*.required)" select? A tr with a descendant with a
> "required" class? It doesn't seem like XPath to me

Yeah, that's jQuery, not XPath. "tr:has(*.required)" would be approximately "tr[*[contains(@class, 'required')]]" in XPath, I think. XPath is ugly when using predicates.

> But, as Simon pointed out, it might be a pretty big performance hog.
> And generally the template library mostly contains easy-to-understand
> functionality, while "transform" just screams XSLT :)

This is true, but since templates are usually used to generate HTML, it seems like a good idea to offer some HTML/XML-specific features when there's a benefit. Not sure if this counts, since it's easily contained in a third-party app.

-Justin
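To make the {% addclass %} idea above concrete, here is a deliberately crude sketch. Real selector support ("tr:has(*.required)") would need an HTML parser; here the "selector" is just a bare tag name matched with a regex, purely for illustration:

```python
import re

def addclass(html, tag, cls):
    # Give every bare <tag> open tag a class attribute containing `cls`.
    # A real implementation would parse the HTML and support CSS-style
    # selectors; this regex version only handles attribute-less tags.
    return re.sub(r'<%s>' % tag, '<%s class="%s">' % (tag, cls), html)

print(addclass("<tr><label>Name</label></tr>", "label", "required"))
# <tr><label class="required">Name</label></tr>
```

A transform tag would run functions like this over the rendered output of its enclosed block, which is where the performance concern comes from.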
Re: Denormalisation, magic, and is it really that useful?
I think any discussion of denormalization needs to include aggregation as well, since most uses will involve more than simply looking up one value from a related model. I also think this is something definitely to include in core. In my experience at least, denormalization occurs a lot and leaves a lot of room for mistakes, so it's something a framework should handle if possible.

Ideally I'd like to see a calculation field that can take aggregation expressions and options about how to store and synchronize. I imagine something like this syntax:

    class Order(models.Model):
        # updates the order total if the line items change
        subtotal = models.CalculationField(Sum('items.cost'), store=True, sync=True)

        # doesn't update the order's customer name if the customer's name changes
        customer = models.ForeignKey(Customer)
        customer_name = models.CalculationField(F('customer.name'), store=True, sync=False)

        # don't store a rarely used value; calculate on select
        weight = CalculationField(Sum('items.weight'), store=False)

This approach would require some changes to the ORM and aggregation, but it'd be worth it since it'd make denormalization easy, flexible and less error-prone. Fields would need to be able to contribute more to queries and table creation, and we'd need some simple type inference for expressions/aggregation. If calculation support is added to all fields, or calculations have to have a declared type, then type inference wouldn't be necessary, but type inference for aggregation isn't hard.

-Justin
Re: Signal Connection Decorators
Hey Zack,

I just got a chance to look at this, and I like it, but I have one suggestion. From a usage standpoint, wouldn't it be simpler to have the decorator just be the signal name, like @pre_save? I can't see any situation where you'd use a decorator for anything but connecting, so the ".connect" part just seems unnecessary. This could be implemented easily by adding a __call__() method to Signal.

-Justin
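For illustration, a minimal stand-in Signal class shows how adding __call__() would let the signal itself act as the connecting decorator. This is a sketch, not Django's actual django.dispatch implementation:

```python
class Signal:
    """Minimal stand-in for a signal class, for illustration only."""
    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)
        return receiver

    # Making the signal callable means @pre_save behaves the same
    # as @pre_save.connect would.
    def __call__(self, receiver):
        return self.connect(receiver)

    def send(self, **kwargs):
        return [(r, r(**kwargs)) for r in self.receivers]


pre_save = Signal()

@pre_save
def log_save(instance=None, **kwargs):
    return "saving %r" % instance
```

With this, pre_save.send(instance=obj) invokes log_save exactly as if it had been registered through pre_save.connect.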
Re: Recursive inlines in admin?
I'm currently working on recursive inline editing using formsets, but not in the admin. It's not completely working yet, but it's pretty close. I've had to fix a number of things in formsets, and my patch for that is in #8160. I got sidetracked from that project for a while, but I'll be back to it next week.

Cheers,
Justin
Re: Proposal: {% doctype %} and {% field %} tag for outputting form widgets as HTML or XHTML
On Wed, Sep 10, 2008 at 12:43 PM, Simon Willison <[EMAIL PROTECTED]> wrote:

> Django proper was to modify Django's widget.render method to take an
> optional as_xhtml=True/False argument. The {% form %} tag would then
> set that argument based on context._doctype.
>
> I would also modify Context to have an is_xhtml() method which does
> the "self._doctype in xhtml_doctypes" check.

This isn't forms-specific, but I have various classes that output HTML into templates, and one pattern I've used is to have them take the context in their render() method, then display the variable with a tag that calls render(), like:

    {% render my_variable %}

I think it'd be interesting to apply this pattern to general variable lookups, as it would solve this issue and be applicable to many more problems. Variable lookups could check for render() and call it if it's there, passing the context. Then you just write {{ my_variable }} or {{ field }}. Too big of a change?

-Justin
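As a sketch of that lookup hook (names hypothetical; this is not Django's actual Variable resolution code):

```python
class FancyWidget:
    """Any object that knows how to render itself with the template context."""
    def render(self, context):
        return "<p>Hello, %s</p>" % context.get("user", "anonymous")

def resolve_variable(value, context):
    # Hypothetical hook: if the resolved object exposes a render()
    # method, call it with the current context instead of str()-ing it.
    render = getattr(value, "render", None)
    if callable(render):
        return render(context)
    return str(value)

context = {"user": "Justin"}
print(resolve_variable(FancyWidget(), context))  # <p>Hello, Justin</p>
print(resolve_variable(42, context))             # 42
```

A render-aware {{ my_variable }} could then output doctype-appropriate HTML without any new tag syntax.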
Re: Proposal: user-friendly API for multi-database support
On Wed, Sep 10, 2008 at 12:30 PM, Simon Willison <[EMAIL PROTECTED]> wrote:

> On Sep 10, 7:13 pm, "Justin Fagnani" <[EMAIL PROTECTED]> wrote:
>> For application-wide db connections, I think it'd be much easier and
>> more portable to choose the connection in settings.py rather than in a
>> Model.
>
> That's a very interesting point, and one I hadn't considered. It makes
> sense to allow people to over-ride the connection used by an
> application they didn't write - for example, people may want to tell
> Django that django.contrib.auth.User should live in a particular
> database. Furthermore, just allowing people to over-ride the
> connection used for an existing application isn't enough - you need to
> be able to over-ride the default get_connection method, since you
> might want to shard Django's built-in users (for example).

I think this example highlights the problem with per-Model db connections: it'll only work if either that model is not related to the others in the app, or the other models in the app also use the same db. This will probably make per-application db connections a much more common use case than per-Model ones.

> 2. Have a setting which lets you say "for model auth.User, use the
> get_connection method defined over here". This is made inelegant by
> the fact that settings shouldn't really contain references to actual
> function definitions, which means we would probably need to use a
> 'dotted.path.to.a.function', which is crufty.

Considering that this is how every module, function and class is referred to in settings, I don't think it'll be that big of a deal. I especially like Waylan's suggestion.

> 3. Use a signal. There isn't much precedence in Django for signals
> which alter the way in which something is done - normally signals are
> used to inform another part of the code that something has happened.

The nice thing about signals is that they allow any arbitrary scheme for selecting connections without modifying the application. For the User case above, you could register a function that chooses a replica for User queries, but only on selects that don't join with a model outside the auth app.

I see your point about not changing how things are done with signals. I was thinking this would be done most simply by sending the QuerySet with the signal, but that opens things up to a lot more changes than just db connections. That could end up being a way to introduce very hard-to-find bugs. I still like how easy it makes it to customize db access without altering the app itself.

-Justin
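A registry-based version of that scheme could be sketched as follows. Everything here (the function names, the "replica"/"default" labels, the model-label strings) is hypothetical; it only illustrates how connection selection could live outside the application being queried:

```python
# Hypothetical registry: installed apps or the project register callables
# that may pick a database connection for a given model and operation;
# the first non-None answer wins.
connection_choosers = []

def register_chooser(func):
    connection_choosers.append(func)
    return func

def get_connection(model_label, operation="select"):
    for chooser in connection_choosers:
        conn = chooser(model_label, operation)
        if conn is not None:
            return conn
    return "default"

@register_chooser
def auth_reads_from_replica(model_label, operation):
    # Route only auth.User reads to a replica; everything else falls through.
    if model_label == "auth.User" and operation == "select":
        return "replica"
    return None

print(get_connection("auth.User"))           # replica
print(get_connection("auth.User", "insert")) # default
```

The same hook could implement sharding by returning a connection name computed from the query instead of a constant.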
Re: multi-column field support (#5929)
On Tue, Sep 9, 2008 at 4:08 PM, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:

> I've got most of it done locally, but I need to port it forwards to
> match recent changes on trunk ...
> if you haven't seen anything in a week, ping me again.

Thanks, that's great to hear.

-Justin
Re: Ticket 8949: metronome/django-stones
I think one very important feature is submitting results back to djangoproject.com for comparison. Since Django is so dependent on underlying components, it'll be very hard to compare results, but at the very least we can track things like:

- CPU type and speed
- Python version
- memory (installed, free, Python usage)
- OS
- loaded Python modules

And it might be worthwhile to run something like pybench just to give a baseline number for comparisons. With that data it should be a lot easier to make sense of the results.

-Justin

(Oh, Metronome is a great name. Meter or Tempo are also good on several levels.)

On Tue, Sep 9, 2008 at 5:02 PM, Simon Willison <[EMAIL PROTECTED]> wrote:

> On Sep 10, 12:24 am, "Jeremy Dunck" <[EMAIL PROTECTED]> wrote:
>> OK, enough noise on the naming.
>
> (I really like metronome)
>
>> Let's talk about what it should be and what should be measured. :)
>> (I suspect some devs already have a sketch regarding this stuff.
>> Please share.)
>>
>> Do we want it to result in one big number like python/Lib/test/pystone.py?
>
> I don't know much about benchmarking, but it seems to me it would be
> most useful if we got one big number and about a dozen other numbers,
> one for each category of performance testing. That would make it
> easier to see if changes we made had an effect on a particular
> subsystem, and also ties in nicely to your next point:
>
>> Do we want to provide hooks for apps to supply something to stones for
>> site-specific stone testing?
>
> That seems sensible. It's like unit testing - we'll need code that
> finds and loads the benchmarks for Django core, so we may as well get
> it to look in user applications as well.
>
> As for what we measure, I think to start off with we just go with the
> basics: startup, request cycle, template processing, signals and ORM.
> If we get the wrapper mechanism right it will be easy to add further
> stuff once we have those covered.
>> Also, what about things that affect downstream performance, but don't
>> affect our runtime, like the HTTP Vary header?
>
> I say we ignore those entirely. Other tools like YSlow can pick up the
> slack there.
>
> Cheers,
>
> Simon
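The environment data listed at the top of this message can be gathered from the standard library alone. A small sketch (the exact fields and their names are made up for the example):

```python
import platform
import sys

def environment_report():
    # Snapshot of the machine and interpreter, to be submitted alongside
    # benchmark numbers so results can be meaningfully compared.
    return {
        "python_version": platform.python_version(),
        "python_implementation": platform.python_implementation(),
        "machine": platform.machine(),
        "os": platform.system(),
        "module_count": len(sys.modules),  # stand-in for "loaded modules"
    }

report = environment_report()
print(sorted(report))
```

CPU speed and memory figures would need platform-specific probing (or a third-party module), which is why a shared baseline like pybench is attractive.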
multi-column field support (#5929)
I forgot to mention this as one of my ponies at DjangoCon, but I'd really love to have multi-column fields. I particularly need this for measurement and currency fields, which I think would be awesome to have built in.

So, is anyone working on #5929? I see an email from Malcolm in March [1] that mentions he might have had some pieces of the puzzle coming with qs-rf. At first glance it certainly doesn't appear too hard, except that I'm not sure how much this issue interacts with #373. I think I see where the logical changes would go, and I'd be willing to put the work in on this over the next couple of weeks.

Hope everyone's recovering and relaxing well post-1.0/DjangoCon.

-Justin

[1] http://groups.google.com/group/django-developers/msg/4f75b4d9f569f8a9
Re: BaseModelFormSet and ModelForm.instance
On Wed, Aug 20, 2008 at 8:39 AM, Brian Rosner <[EMAIL PROTECTED]> wrote:

> I am slightly unclear on what is allowed to
> be broken in this phase of Django development. I suspect it is okay
> since those methods are not explicitly documented, and a quick note on
> the wiki would be decent. Someone please correct me if I am wrong, but
> will make this item post-1.0.

I could rewrite the patch to preserve those methods, but it'd be a lot less elegant. I could see the argument that save_existing_objects() and save_new_objects() are useful in some ways.

>> I don't think get_queryset is needed either.
>
> This however I just don't understand. What is the reasoning for this
> change? Removing the method will certainly break code and is documented,
> but it seems the patch doesn't need this to be removed. I see there is
> no need for it in the save stage, but surely it can still be used as an
> extensibility hook. Passing the queryset directly in as initial data
> has problems too and needs to be run through model_to_dict to fix
> those issues. I don't see why that needed to be removed.

You are entirely correct. I forgot about the documentation of overriding get_queryset(). I actually didn't remove it (if I had, InlineFormSets would have broken), but I have to fix the patch to use it instead of self.queryset in some places.

>> I know it's a rough patch, so any advice on how to improve it, or what
>> tests to add?
>
> Just out of curiosity, do the tests pass with your patch? If so, I
> suspect the coverage isn't good enough ;)

Yup, the tests pass. If I'm looking in the right places, the coverage seems pretty bad: ModelFormSets and InlineFormSets are not tested at all.

Do you have a suggestion for not passing a queryset as the 'initial' argument to FormSet? I'm thinking that either the queryset is translated to a list of dicts like before, so the forms will have both an instance and initial data, or BaseFormSet.__init__() should be broken up so that _initial_form_count is set in another method that can be overridden. I like the latter.

-Justin
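The refactoring suggested in the last paragraph could look roughly like this (class names simplified; this is not Django's actual formset code):

```python
# Sketch: move the initial-form counting out of __init__ and into a
# method, so a model-backed subclass can derive the count from its
# queryset instead of from `initial`.
class BaseFormSet:
    def __init__(self, initial=None):
        self.initial = initial or []
        self._initial_form_count = self.initial_form_count()

    def initial_form_count(self):
        return len(self.initial)


class ModelFormSet(BaseFormSet):
    def __init__(self, queryset=None):
        self.queryset = list(queryset or [])
        super().__init__()

    def initial_form_count(self):
        # Overridden hook: count comes from the queryset, no translation
        # of model instances into initial-data dicts required.
        return len(self.queryset)


fs = ModelFormSet(queryset=["obj1", "obj2"])
print(fs._initial_form_count)  # 2
```

The point of the hook is that the base class never needs to know where the count comes from.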
Re: BaseModelFormSet and ModelForm.instance
I attached a patch to #8160 that fixes this issue (and probably #8071). I'm not sure what parts of BaseModelFormSet are considered official API. In the patch, ModelForm.save() is now called by ModelFormSet.save(), and I think the methods save_new, save_existing, save_existing_objects, and save_new_objects are no longer needed. I don't think get_queryset is needed either.

I know it's a rough patch, so any advice on how to improve it, or what tests to add?

-Justin

ps. The *_factory methods seem odd to me. I wonder why metaclasses weren't used here, but I understand that it's too close to 1.0 to change anything.

On Sun, Aug 17, 2008 at 7:57 PM, Justin Fagnani <[EMAIL PROTECTED]> wrote:

> I just noticed that BaseModelFormSet takes a queryset and populates
> its forms' initial data from the queryset, but doesn't set its forms'
> instance attributes. Was this done by design?
>
> If not, it seems like the bottleneck is that BaseModelFormSet is
> completely relying on BaseFormSet to construct the forms, which
> obviously doesn't know about ModelForms and instances, but if
> BaseModelFormSet overrides _construct_form() it can set the instance.
> I worked around it by subclassing BaseModelFormSet:
>
>     class InstanceFormSet(BaseModelFormSet):
>         def _construct_form(self, i, **kwargs):
>             if i < self._initial_form_count:
>                 defaults = {'instance': self.queryset[i]}
>             else:
>                 defaults = {}
>             defaults.update(kwargs)
>             return super(BaseModelFormSet, self)._construct_form(i, **defaults)
>
> but this is how I would have expected BaseModelFormSet to behave.
> Should I open a ticket?
>
> -Justin
Re: initial data for inlineformset_factory
Uh... disregard that. My mistake. I read the code wrong: BaseInlineFormSet does not take **kwargs and pass them to BaseModelFormSet.__init__(), so, no, you can't specify 'initial'. Which is good.

-Justin

On Tue, Aug 19, 2008 at 12:54 PM, Justin Fagnani <[EMAIL PROTECTED]> wrote:

> On Tue, Aug 19, 2008 at 12:22 PM, Brian Rosner <[EMAIL PROTECTED]> wrote:
>> Seems
>> a bit overkill for the general case due to the unknown size of the
>> queryset.
>
> It is possible to specify both a queryset and initial data with a
> ModelFormSet or InlineModelFormSet, and initial will override the
> queryset, but the queryset is still kept around. Looks like that could
> cause some problems in save_existing_objects(), especially if initial
> and queryset aren't the same size.
>
> -Justin
Re: initial data for inlineformset_factory
On Tue, Aug 19, 2008 at 12:22 PM, Brian Rosner <[EMAIL PROTECTED]> wrote:

> Seems
> a bit overkill for the general case due to the unknown size of the
> queryset.

It is possible to specify both a queryset and initial data with a ModelFormSet or InlineModelFormSet, and initial will override the queryset, but the queryset is still kept around. Looks like that could cause some problems in save_existing_objects(), especially if initial and queryset aren't the same size.

-Justin
Suggestions for inclusion_tag (and a problem with #5034?)
I ran into a problem trying to use inclusion_tag() in combination with a template that used the url tag with #5034 applied.

First, #5034, which patches the url tag to use request.urlconf if it's been set, checks for an instance of RequestContext rather than for the key 'request'. This could be changed, with the assumption that context['request'] should always be a request object.

Then, inclusion_tag, unlike the include tag, creates a new context to pass to template.render(), rather than passing the current context. You can pass a context_class to inclusion_tag, but since RequestContext takes a request, you can't use that.

I wonder why inclusion_tag is different from include in this respect. Can't the dict returned by the tag function be pushed onto the current context? That way the template gets the same context class, it behaves more like the include tag, and the code actually becomes simpler. Is there some reason I'm not seeing?

Also, is there any chance of #5034 making it into 1.0? I don't know how many users set request.urlconf via middleware, but it is documented, and this patch is critical for using url and reverse().

Thanks,
Justin
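The "push the dict onto the current context" idea can be sketched with a toy context class. This is a simplification for illustration, not Django's actual template internals:

```python
class Context:
    """Toy layered context: lookups search the stack top-down."""
    def __init__(self, initial=None):
        self.stack = [initial or {}]

    def push(self, mapping):
        self.stack.append(mapping)

    def pop(self):
        return self.stack.pop()

    def __getitem__(self, key):
        for layer in reversed(self.stack):
            if key in layer:
                return layer[key]
        raise KeyError(key)

def render_inclusion(tag_func, context, template):
    # Push the tag function's dict so the included template sees both its
    # own variables and everything already in the caller's context
    # (request, doctype, and so on), whatever the context class is.
    context.push(tag_func())
    try:
        return template(context)
    finally:
        context.pop()

ctx = Context({"request": "<request>"})
out = render_inclusion(lambda: {"name": "Justin"},
                       ctx,
                       lambda c: "%s for %s" % (c["name"], c["request"]))
print(out)  # Justin for <request>
```

Because the caller's context object is reused, a RequestContext (and anything middleware put in it) would flow through to the included template for free.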
Re: Community representation, or, #django user "Magus-" needs to go far away
Ha, Magus is great. He's helped so many people, it's mind-boggling. Rather than ban him, I'd say he should get donations.

The problem on every IRC channel or email list is that often people don't actually listen to the advice given to them, or read the documentation pointed out to them. Magus may get a little short, but he's hardly unjustified from what I've seen. Sure, some others have more patience and are gentler, but no one on #django offers the sheer volume, correctness and depth of help that Magus does.

I wonder though, what exactly do you propose? If Magus wants to be on #django, he'll be there no matter what anyone does.

-Justin

On Wed, Jun 25, 2008 at 2:21 PM, Tom Tobin <[EMAIL PROTECTED]> wrote:

> I don't spend much time in #django on Freenode, but for a moment, I'd
> like you to check the logs of that channel.
>
> http://oebfare.com/logger/django/
>
> Specifically, I'd like you to note interactions with user "Magus-"
> (with trailing dash).
>
> I think we have a representation problem on our hands. If this guy is
> the first person most users encounter, they're going to have a *very*
> different view of the Django community than I consider ideal. I've
> heard from several individuals that they've migrated from Ruby on
> Rails *specifically because* our community was nicer; "Magus-"
> threatens that.
>
> I've had an awful experience speaking with Magus; one of my co-workers
> at The Onion has, as well. But don't take my word for it; check those
> logs.
>
> This guy has to go.
Re: Aggregate Support to the ORM
Interesting problem. The real question is, what is the user's intent on a query like that? I have no idea. They're probably thinking along the lines of result #2 if they don't understand the underlying SQL, and #1 if they do. It's probably not a good idea to assume either point of view. Using Sum in that example brings up a third possibility too: returning the same values as age, as if there weren't a join or a grouping.

I wonder if aggregate functions on the queryset's model should even be allowed in annotate(). I can't think of a case that makes sense. That restriction would certainly get rid of some ambiguity. In the same vein, maybe implicitly grouping by all fields when no values() arguments are present is a bad idea too.

-Justin

On Thu, May 1, 2008 at 7:07 AM, Nicolas E. Lara G. <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I've been looking into the different use cases for aggregation and
> came across one where I wasn't very sure what the expected behaviour
> should be (regardless of the syntax used to express it).
>
> If we do something like:
>
>     Buyer.objects.all().annotate('purchases__quantity__sum', 'age__max')
>
> or
>
>     Buyer.objects.all().annotate(Sum('purchases__quantity'), Max('age'))
>
> there are a few possibly expected results:
>
> (1) [{'purchases__quantity__sum': 777L,
>       'age': 35,
>       'age__max': 35,
>       'id': 2,
>       'name': u'John'},
>      {'purchases__quantity__sum': 787L,
>       'age': 24,
>       'age__max': 24,
>       'id': 1,
>       'name': u'Peter'}]
>
> In this case we are returning the result of a query like this:
>
>     SELECT "t_buyer"."id", "t_buyer"."name", "t_buyer"."age",
>            SUM("t_purchase"."quantity"), MAX("t_buyer"."age")
>     FROM "t_buyer"
>     INNER JOIN "t_buyer_purchases"
>         ON ("t_buyer"."id" = "t_buyer_purchases"."buyer_id")
>     INNER JOIN "t_purchase"
>         ON ("t_buyer_purchases"."purchase_id" = "t_purchase"."id")
>     GROUP BY "t_buyer"."id", "t_buyer"."name", "t_buyer"."age"
>
> and the aggregation on the whole model does not happen because of
> the grouping ("select max(i) from x group by i" == "select i from x").
>
> (2) [{'purchases__quantity__sum': 777L,
>       'age': 35,
>       'age__max': 35,
>       'id': 2,
>       'name': u'John'},
>      {'purchases__quantity__sum': 787L,
>       'age': 24,
>       'age__max': 35,
>       'id': 1,
>       'name': u'Peter'}]
>
> In this case we are seeing the result of two queries combined: the
> previous one for the annotation, and this one for aggregating on the
> whole model:
>
>     SELECT MAX("t_buyer"."age") FROM "t_buyer"
>
> With (1) we can get unexpected results. Imagine we were not using max
> but sum instead: the buyer's age would be summed as many times as he
> has made a purchase.
>
> With (2) we would have to make two queries while the user expects only
> one to happen. Also, for users that are used to SQL, for very wicked
> reasons, some user might be interested in executing a query that
> actually sums for every relation.
>
> The strange-query requirement is a very weak reason for this behaviour,
> since for that kind of thing you can always fall back to SQL. The
> number of queries, on the other hand, is something that can be
> problematic. Should we state that the number of queries of an
> aggregate should be only one, or is it okay, for user convenience, to
> have querysets that perform more than one SQL query?
>
> Another possible solution is to simply restrict the aggregation to
> either one table only, or one pair of tables at a time. We could also
> just stay with (1) and it would be the user's responsibility to make
> the adequate queries.
>
> What do you think?
>
> --
> Nicolas Lara
Re: API question for model saving
On Mon, Apr 28, 2008 at 12:11 PM, Mike Axiak <[EMAIL PROTECTED]> wrote:
> I would rather see two different parameters, e.g.:
>
>     .save([force_update=True|False, force_create=True|False])
>
> Both default to False, and raise a ValueError when both are True.

I think this is by far the most understandable variation, even though there's an invalid combination.

-Justin
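The rule Mike describes can be made concrete with a minimal sketch (hypothetical helper, not Django code):

```python
def check_save_flags(force_create=False, force_update=False):
    """Decide which SQL operation save() should perform, rejecting the
    one invalid flag combination (a sketch of the proposed API)."""
    if force_create and force_update:
        raise ValueError("Cannot force both a create and an update.")
    if force_create:
        return "INSERT"
    if force_update:
        return "UPDATE"
    # Default behavior: try an UPDATE first, fall back to an INSERT.
    return "INSERT-or-UPDATE"
```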
Re: Aggregate Support to the ORM
Hey everyone,

Good to see this has come up again, and congrats on the GSoC selection, Nicolas. The previous thread [1] has a lot of good discussion in it if anyone here wasn't following it.

Obviously, I prefer the expression-object API. I've actually been using it a bit in an order-processing app I'm working on. I've been keeping the patches up to date with qs-rf, and I just merged them with the new trunk. If anyone wants to check them out, where should I put them -- as an attachment on #3566? (It works a bit differently from what's described here, folding all the behavior into values(), which might not be ideal.)

[1]: http://groups.google.com/group/django-developers/browse_thread/thread/3d4039161f6d2f5e/

I really like Honza's idea of an AggregateModel, at least for cases where there's a 1-1 correspondence between results and actual instances, so that the model will still behave as expected. To keep from cluttering the model's attributes, aggregate values could be put into another object or dict:

    >>> myproduct.aggregates['avg_price']

I like the idea less when the result would be a representative of a group. Calling methods on such an instance could produce unexpected results, because not all the data is there, or it has invalid values (averaging an integer field, etc.). In those cases, I don't think it's a bad idea to require the use of values() and/or aggregate().

Also, there will probably be cases where we want to iterate over the members of the groups, so maybe instead of a list of dicts, aggregate() could return a list of objects, so that a query like:

    >>> people = Person.objects.values('age').aggregate(Avg('income'))

returns a list of objects that you can use like dicts:

    >>> people[0]['age']

and get a queryset from:

    >>> people[0].objects()

On Sun, Apr 27, 2008 at 11:26 AM, Nicolas Lara <[EMAIL PROTECTED]> wrote:
> Having multiple classes seems confusing.

I'm not sure why multiple classes would be confusing, since they do represent different behaviors. If the concern is dealing with many classes, that doesn't seem different from the many function names that need to be passed to the A class.

> I would propose to have a single class (A?) to do the queries. So you
> could do something like:
>
>     aggregate(height=A('max', 'height'), av_friend_age=A('avg', 'friend__age'))

At least for readability, I think this is clearer:

    aggregate(height=Max('height'), av_friend_age=Avg('friend__age'))

In addition, some cases have special requirements, and dealing with them in a class is easy.

Cheers,
Justin
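The two APIs being compared -- per-function classes like Max and Avg versus a single A class taking a function name -- can be sketched side by side, with SQL generation reduced to string building (all names here are illustrative, not Django's actual implementation):

```python
class Aggregate:
    """One-class-per-function style: Max('height'), Avg('friend__age')."""
    sql_function = None

    def __init__(self, field):
        self.field = field

    def as_sql(self):
        return "%s(%s)" % (self.sql_function, self.field)

class Max(Aggregate):
    sql_function = "MAX"

class Avg(Aggregate):
    sql_function = "AVG"

class A:
    """Single-class style: A('max', 'height'), function passed as a string."""
    def __init__(self, function, field):
        self.function = function
        self.field = field

    def as_sql(self):
        return "%s(%s)" % (self.function.upper(), self.field)
```

The per-function style makes it easy to override behavior in a subclass when a particular function has special requirements, which is awkward when the function is only a string argument.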
Re: API question for model saving
On Sun, Apr 27, 2008 at 8:18 PM, Ken Arnold <[EMAIL PROTECTED]> wrote:
> Possible fix: define more clearly whether Django thinks that the model
> is in the database or not. Currently that's (IIRC) based on whether
> the model has a primary key. But as Malcolm points out, a semantic
> primary key breaks this assumption.

This would be extremely useful for me. I often need to make sure that an object is never updated, that only some fields are updated, or that an object can only be updated in certain cases. The object itself can determine when updates are allowed, as long as it can tell whether it already exists in the DB and can get the current instance if it does. Currently I use a combination of checking the primary key and querying the DB if it's set.

If there were model methods like exists() and saved_instance(), then to force an update you could just use an if:

    if object.exists():
        object.save()

And in my cases, I could override save():

    def save(self):
        if self.exists():
            c = self.saved_instance()
            if c.immutable_field != self.immutable_field:
                raise SomeError("immutable_field cannot be changed")
            if c.locked:
                raise SomeOtherError("Object %s is locked" % self.id)
        super(Model, self).save()

This could be more complex than what Malcolm's looking for, though. Usual disclaimers as well.

-Justin
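The shape of the proposed API can be sketched outside of Django, with a dict standing in for the database table (class and method names here mirror the proposal but are otherwise hypothetical):

```python
_db = {}  # pk -> saved instance; stands in for the database table

class Record:
    """Sketch of a model with the proposed exists()/saved_instance() methods."""

    def __init__(self, pk, locked=False):
        self.pk = pk
        self.locked = locked

    def exists(self):
        # True only if a row with this primary key is already stored.
        return self.pk in _db

    def saved_instance(self):
        # The currently stored copy, or None if this object is unsaved.
        return _db.get(self.pk)

    def save(self):
        # The object can veto its own update by comparing against the
        # stored copy, as in the save() override above.
        if self.exists() and self.saved_instance().locked:
            raise RuntimeError("Object %s is locked" % self.pk)
        _db[self.pk] = self
```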
Auth and model inheritance idea
I started playing with subclassing the auth models today. What's nice is that when you subclass User with, say, UserProfile, all users get a .userprofile property, so the functionality of get_profile() is there for free, with the bonus of multiple User subclasses if that's needed.

I was thinking it might be useful to have a setting like AUTH_USER_CLASS, so that the specified subclass of User is used for request.user, create_user(), authenticate(), login(), the auth views, etc. Maybe something like this is already in the plans, since get_profile() seems out of place or in need of updating for use with inheritance, but I couldn't find anything about it.

cheers,
Justin
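Such a setting would presumably hold a dotted path that gets resolved to a class at runtime. A sketch of that resolution (the setting name is from the proposal above; the value and helper are made up, with a stdlib class standing in for a User subclass):

```python
import importlib

AUTH_USER_CLASS = "collections.OrderedDict"  # stand-in dotted path

def get_user_class(dotted_path=AUTH_USER_CLASS):
    """Resolve a 'module.ClassName' string to the class it names."""
    module_path, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```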
Re: Aggregates
Hey David,

Right now sql.Query doesn't output the HAVING clause, but if it did, I think you could do this with a custom filter with an add_to_query() method that calls setup_joins and appends to query.having.

Also, with annotate() but without group_by(), as proposed, your query would look like:

    >>> MyModel.objects.annotate(count=Count('my_other_model')).filter(count__gt=1)

I'm not really sure whether it's bad that the SQL in this case contains COUNT(my_other_model.id) rather than COUNT(1).

Aggregation support in filter() might help make the intent clearer. With the function=tuple syntax, I think your query would look something like:

    >>> MyModel.objects.filter(my_other_model__count__gt=1)

With expressions, I can think of two options:

    >>> MyModel.objects.filter(Count('my_other_model') > 1)

which would work by overloading __gt__, or:

    >>> MyModel.objects.filter(GT(Count('my_other_model'), 1))

but I'm not sure I like either of those.

I think the case against a group_by() method is that if you look at what you're really trying to do here, you don't need an explicit group_by, since the grouping can be inferred from the filter. Maybe try some more examples to see whether group_by() is actually necessary. It could be that implicit grouping works fine for all or most of your cases.

> Sorry -- I missed page 2. So GROUP BY and similar things will be
> supported through an Aggregates base class?

I'm not sure which Aggregates class you're referring to: the base class for aggregate expressions, or the AggregateQuerySet? With the current version of expressions, each expression has an aggregate() method that traverses the tree and returns True if any children return True. If so, it triggers auto-grouping based on all non-aggregating fields and expressions. The Aggregate base class returns True by default, but the logic is not based on isinstance(). This leaves the door open for things like an N-ary Min function that's only an aggregate if it has a single argument and that argument is a field name.

Cheers,
Justin

On Sun, Mar 23, 2008 at 3:19 PM, David Cramer <[EMAIL PROTECTED]> wrote:
> So we're not going to support GROUP BY at all, even though it's
> extremely simple? I don't understand why everyone says my use of SQL
> (and Curse's, and many other people's) is so uncommon. I'd guarantee
> almost every developer I know who writes any kind of scalable code
> (and I've seen what they've written) uses GROUP BY in SQL for much
> more than just SUM or COUNT or any of those operations.
>
> There really is no logical reason not to allow this logic in Django.
> Even if it's in the extra() clause, it's better than not having it. I
> *refuse* to write raw SQL for something that took 30 minutes to patch
> in, and I will continue to fork Django until the developers see fit to
> include logical changes. While many of these are going to be present
> in qs-rf, and qs-rf will make everything easier, it doesn't add much of
> the *needed* functionality for complex projects (again, not everyone
> is building simplistic blog software with Django).
>
>     SELECT my_table.* FROM my_table JOIN my_other_table
>     WHERE my_other_table.my_table_id = my_table.id
>     GROUP BY my_table.id HAVING COUNT(1) > 1;
>
> How do you ever hope to achieve something like this in the ORM without
> patches like mine? This is a very useful query, and we used it heavily
> for tagging applications, which seem to be fairly common nowadays.
> Realistically, how do you efficiently query for objects that have two
> tags without joining on the tag table N times? It seems like a common
> enough use case to add this functionality. You have to understand that
> DISTINCT doesn't solve every grouping situation.
>
> My main argument is that there's no reason *not* to include the
> functionality except that a few select people say "we don't need it".
> It's not causing performance overhead. It's not a lot of work. There
> are patches already made (and if they're not on Trac, I have them
> locally). I really don't want to have to switch to Storm or SQLAlchemy,
> because honestly, I love Django's approach. The more complex the
> projects I build, though, the more headaches I'm having, and this is
> mainly due to the ORM being so damned limited and databases being so
> complex.
>
> On Mar 18, 6:36 pm, "Russell Keith-Magee" <[EMAIL PROTECTED]> wrote:
> > On Wed, Mar 19, 2008 at 9:26 AM, Justin Fagnani
> > <[EMAIL PROTECTED]> wrote:
> > > Hey Nicolas,
> > >
> > > It seems to be a bad week to have this discussion with all the main devs
> > > busy with the PyCon sprint and aggregation not being a top priority, but I
> > > want to see if I can clarify my objection to the field__function=alia
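David's tagging query is mechanical enough that its SQL can be sketched with plain string building (table and column names are placeholders, not a real schema):

```python
def having_count_sql(table, join_table, fk_column, min_count):
    """Build a GROUP BY ... HAVING query that keeps rows of `table`
    with more than `min_count` related rows in `join_table`."""
    return (
        "SELECT {t}.* FROM {t} "
        "JOIN {j} ON {j}.{fk} = {t}.id "
        "GROUP BY {t}.id "
        "HAVING COUNT(1) > {n};"
    ).format(t=table, j=join_table, fk=fk_column, n=min_count)
```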
Re: Aggregates
On Sun, Mar 23, 2008 at 12:23 PM, Winsley von Spee <[EMAIL PROTECTED]> wrote:
> Perhaps I've just read over it, but how do you plan to handle non-
> numeric types like datetime? In MySQL you have to use something like
> "FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(today)))", but how should that be
> handled in the ORM, especially as it's backend-dependent?

There are quite a few things that are backend-dependent; string concatenation and dates are the big ones. Functions need a way to get the proper SQL from the backend, or to delegate their whole behavior to a backend-specified class. With backend-defined functions, your example would look something like Date(Avg(Timestamp('date_col'))) (or the explicit type conversions could possibly be hidden, allowing Avg('date_col')).

String concatenation is more difficult, because supporting + requires a type-inference system so that 'a' + 'b' emits CONCAT('a', 'b') on MySQL. This is where open-ended expressions could get very complicated. Type inference isn't so bad at first look, but I'm not completely familiar with each database's coercion rules. It's possible that type inference has backend dependencies too, and we would want to hide that from the programmer. In a perfect world a valid expression would always work independent of the backend (so the coercion should be conservative and support least-common-denominator behavior), but for that to be true, certain expressions that work on one backend but not another would have to fail even when running on the backend that supports them.

This all could become very tricky very quickly, and it might be better to pick a very limited subset of functionality to support at first -- say, expressions only work with numbers, or numbers and dates. Something so that the first version is simple and just works, but doesn't get in the way of supporting everything else in the future. For the time being, with grouping and extra(), date averaging could be done with SQL.
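The backend delegation described above could be sketched like this (class and method names are invented for the sketch; the MySQL spelling is from Winsley's example, and the PostgreSQL spelling is one common idiom, not a claim about Django's backends):

```python
class AvgDatetime:
    """An aggregate whose SQL spelling is delegated to the backend."""
    def __init__(self, col):
        self.col = col

    def as_sql(self, backend):
        # The expression knows *what* to compute; the backend knows *how*.
        return backend.avg_datetime_sql(self.col)

class MySQLBackend:
    def avg_datetime_sql(self, col):
        # MySQL cannot AVG datetimes directly; go through epoch seconds.
        return "FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(%s)))" % col

class PostgresBackend:
    def avg_datetime_sql(self, col):
        # Average the interval since a fixed epoch, then add it back.
        return "TIMESTAMP 'epoch' + AVG(%s - TIMESTAMP 'epoch')" % col
```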
-Justin
Re: Pick a value from list variable in templates
There must be a lot of us who have created a similar filter. My particular version looks like this:

    from django.template import resolve_variable

    def lookup(value, key):
        return resolve_variable(key, value)

so that it behaves like normal attribute access. I use this to do things like send a generic template a list of model fields that it should display. It's rarely needed, so I don't see the need for new syntax when it's so easily done with a filter (or in the view). It might not hurt to include it by default, though.

As for a new syntax (not that I'm actually proposing this): how bad would overloading : be, so that we'd have foo:bar for a lookup if foo is a variable? What's interesting to me about this idea is the possibility of passing a function as a context variable and calling it in the template this way. That could take the place of tags or filters in cases where you're really only going to use something once. (I expect this idea might not go over well. :)

On Thu, Mar 6, 2008 at 6:49 PM, Adrian Holovaty <[EMAIL PROTECTED]> wrote:
> On Thu, Mar 6, 2008 at 8:25 PM, Ian Kelly <[EMAIL PROTECTED]> wrote:
> > Why do I have a sudden fear that a branch is going to spawn called
> > "newtemplates"?
>
> That won't be happening.

OT: Even so, the syntax for passing arguments to filters is _extremely_ limiting. I think this is one place where Jinja has improved on Django templates. The best example of how useful this would be is string replacement, like:

    {{ variable|replace('foo', 'bar') }}

I don't think it would take a "newtemplates" to add this type of argument passing.

Justin
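The filter's behavior can be sketched without Django by reimplementing the dotted-path walk (simplified: real template variable resolution also tries methods and other callables):

```python
def resolve(path, obj):
    """Walk a dotted path the way the lookup filter does: dictionary
    keys first, then list indexes, then attributes."""
    for bit in str(path).split("."):
        if isinstance(obj, dict):
            obj = obj[bit]
        elif bit.isdigit():
            obj = obj[int(bit)]
        else:
            obj = getattr(obj, bit)
    return obj
```

In a template this would be used as {{ object|lookup:field_name }}, where field_name is itself a context variable, which is exactly what a static {{ object.field }} lookup can't express.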