Re: Signal performance improvements

2008-03-24 Thread Brian Harring
On Mon, Mar 24, 2008 at 10:55:03PM -0500, Jeremy Dunck wrote:
> One other enhancement I thought might be good is to have a
> Model.pre_init signal

Such a signal exists already, last I looked- same for post_init... 
Please clarify.

> Other similar class-based
> receiver lists might speed things up in some cases; the down-side is
> that each receiver list has some overhead, even if empty.

Please see ticket 4561... my attack on dispatch was to disable it when 
nothing was listening, dynamically re-enabling it when there was a 
listener.  Via that, there is *zero* overhead for disconnected signal 
pathways- a slight overhead when something is listening (specifically, 
one additional function call), but pretty much negligible.
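
A rough sketch of the shape of it (hypothetical names; the actual 
patch toggles dispatch on/off dynamically rather than checking a 
list on every send):

class Signal(object):
    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        # first listener flips the signal from no-op to live
        self.receivers.append(receiver)

    def send(self, sender, **kwargs):
        # zero-cost path: disconnected signals bail immediately
        if not self.receivers:
            return []
        return [receiver(sender=sender, **kwargs)
                for receiver in self.receivers]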

Suspect you and I came at it from different pathways; my plan was to 
target the common optimization first (model initialization, a 25% 
boost when no listener is connected), then work my way down.

The next step I was intending, presuming that patch ever got committed, 
was to move the actual connect/send directly into the signal instances, 
including the arg checking; might I suggest merging the functionality 
of the two?

Literally, you're doing the second step I was after, while the first 
step delivers a pretty significant gain for scenarios where signals 
aren't currently used.

~brian




Re: Feature Request: "Abstract Model"

2008-02-04 Thread Brian Harring
On Mon, Feb 04, 2008 at 08:32:17AM -0500, George Vilches wrote:
>Since we've been using a metaclass for doing a similar task, seems
>appropriate to paste it now:
> 
>class ModelMixinBase(ModelBase):
>    def __new__(cls, name, bases, attrs):
>        new_attrs = attrs.copy()
>        for attr, value in cls.__dict__.iteritems():
>            if isinstance(value, models.Field):
>                new_attrs.setdefault(attr, value)
>            elif attr == 'methods':
>                for v in value:
>                    if callable(v):
>                        new_attrs.setdefault(v.__name__, v)
>                    elif isinstance(v, str):
>                        new_attrs.setdefault(v, getattr(cls, v))
>        return super(ModelMixinBase, cls).__new__(cls, name, bases,
>                                                  new_attrs)
>
>class MixinBaseA(ModelMixinBase):
>    common_1 = models.IntegerField()
>    common_2 = models.CharField(max_length=255)
>
>class ResultModel(models.Model):
>    __metaclass__ = MixinBaseA
>    specific_1 = models.IntegerField()
>The end result should give you as far as we have been able to tell a
>perfectly okay Django model instance (we've been using it for months
>and haven't seen any weird behavior yet).  We know it does a touch more
>than what yours does, but it could easily be stripped down to just be
>the equivalent of what you've got above.

I see only one particular fault with this approach- if you ever add 
functionality to allow N inheriting parents, instead of a single line 
of inheritance.  Django internally has a rather voodoo-rific 
creation_counter in the Field class namespace; it serves as an 
instance count for each Field derivative instantiated, and it uses 
that creation_counter to determine where to insert the Field 
derivative into the Model's fields list (which maps out to the sql 
column ordering).

If you ever try to extend your metaclass approach to allow mixing in 
multiple parents (which makes sense, imo), the sql column ordering 
would be dependent on the order of imports.

Aside from that, nifty approach although I think I'd try to mangle the 
ModelMixinBase instance so it was invokable, and then use it like so-

class ResultModel(MixinBaseA(), MixinBaseB()):
   specific_1 = models.IntegerField()

The reason I'd try that direction is due to the creation_counter 
voodoo- if you could slightly bastardize ModelMixinBase.__call__ so 
that it was able to return clones of the fields (with the 
creation_counter incremented), it would solve the sql field order 
issue I mentioned above while enabling N parent inheritance.
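
Sketch of the cloning bit (assuming Field instances survive a 
deepcopy; clone_field is a hypothetical helper, not django api):

import copy
from django.db import models

def clone_field(field):
    # hand the clone a fresh creation_counter, so sql column ordering
    # follows the order the mixins are invoked rather than the order
    # modules happened to be imported
    new_field = copy.deepcopy(field)
    new_field.creation_counter = models.Field.creation_counter
    models.Field.creation_counter += 1
    return new_field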

Just a thought.
~brian




Re: Feature Request: "Abstract Model"

2008-02-04 Thread Brian Harring
On Sat, Jan 12, 2008 at 02:15:32PM -0500, Wanrong Lin wrote:
> I have a data model case that I think probably is quite common to other 
> people too, and I did not find a way to do it with current Django, so I 
> wonder whether the developers can take a look at it.
> 
> My situation is:
> 
> I have quite a few data model classes that share some common fields, so 
> I have the following model code:
> 
> 
> class Common(models.Model):
> # common fields
> ..
> 
> class Model_A(Common):
> # extra fields
> ..
> 
> class Model_B(Common):
> # extra fields
> ..
> ---
> 
> That works, except that a database table will be created for "Common", 
> which will never be used by itself.
> 
> So I will just keep "Common" table empty, not a big deal. But, to make 
> it more elegant (which I suspect a lot of Python programmers are 
> obsessed about), can we add some kind of mechanism to tell Django that 
> "Common" is an "abstract model" that is not intended to be used by 
> itself, so no table needs to be created?

This is a dirty hack mind you, but a rather effective one- I 
personally use it when I need to create common structures w/in 
tables and need to be able to change the structure definitions in a 
single spot.  If you did the following-

def add_common_fields(local_scope):
    local_scope['field1'] = models.IntegerField(blank=True,
        null=True)
    local_scope['field2'] = models.CharField(maxlength=255,
        blank=True, null=True)
    # other common definitions, same thing, updating the passed in
    # dict

you could then just do

class Model_A(models.Model):
    add_common_fields(locals())
    # other fields

class Model_B(models.Model):
    add_common_fields(locals())
    # other fields.

Pros of the approach:
1) you're easily able to add common fields to model definitions, and 
 it Just Works (TM)
2) since the class scope is executed in order, shifting the 
 add_common_fields invocation around lets you shift the sql column 
 ordering as needed.

Cons:
1) exploits the fact locals() in class scope is a mutable dict- I've 
 yet to see commentary indicating this will change anytime soon for 
 cpython, but I've no idea if it works in ironpython/jython (I assume 
 so due to metaclass semantics, but I've not tested it).
2) if you've never seen that trick before and come across it in code, 
 it's likely going to confuse the hell out of the person examining it.  
 Comments are likely warranted to combat that.
3) inheritance would be a bit more pythonic (although inheritance 
 requires some funkiness to be able to control field ordering).

You could probably fold the approach above into a metaclass if 
desired- possibly a bit more pythonic.
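
Sketch of the metaclass variant (untested; note fields injected here 
get their creation_counter assigned when the class is built, after 
the class body's own fields, so ordering still needs care):

from django.db import models
from django.db.models.base import ModelBase

class CommonFieldsMeta(ModelBase):
    def __new__(cls, name, bases, attrs):
        attrs.setdefault('field1',
            models.IntegerField(blank=True, null=True))
        attrs.setdefault('field2',
            models.CharField(maxlength=255, blank=True, null=True))
        return super(CommonFieldsMeta, cls).__new__(cls, name, bases,
            attrs)

class Model_A(models.Model):
    __metaclass__ = CommonFieldsMeta
    # other fields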

Either way, it's a useful trick, so hopefully it helps-
~brian




Re: Proposal: deprecated Model.__init__(*args)

2008-02-04 Thread Brian Harring
On Tue, Jan 29, 2008 at 03:43:22PM -0600, Jacob Kaplan-Moss wrote:
> On 1/29/08, Ivan Illarionov <[EMAIL PROTECTED]> wrote:
> > Yes, but why not have fromtuple classmethod optimized for this use-
> > case? And I believe this way of initialization could be really faster
> > than keyword initialization.
> 
> I'm pretty strongly opposed to a fromtuple (or whatever) method: it's
> brittle and too tightly couples code to the arbitrary order of defined
> fields.

Agreed- if the dev is avoiding repeating themselves, it'll either 
end up requiring people to convert args into kwargs, or vice versa, 
and then invoke a custom method w/ that object- in the end slowing 
things down while making them rather fugly.


> Speed also 
> 
> As it stands now (in QSRF r7049), args are indeed faster than kwargs::
> 
> >>> t1 = timeit.Timer("Person(1, 'First', 'Last')", "from blah.models
> import Person")
> >>> t2 = timeit.Timer("Person(id=1, first='First', last='Last')",
> "from blah.models import Person")
> >>> t1.timeit()
> 25.09495210647583
> >>> t2.timeit()
> 36.52219820022583
> 
> However, much of that extra time is spent dealing with the args/kwargs
> confusion; chopping out the code that handles initialization from args
> gives better timing:

In hindsight, the pops can be optimized out via caching a set of the 
allowed field names and diffing that set against kwargs- that would 
also avoid the semi-costly exception throwing (KeyError at least).

Additional upshot of doing that conversion is that you would get a 
listing of *all* invalid kwargs provided, instead of one of 
potentially many kwargs that didn't match a field name.

Might want to take a stab at that, since it likely will reduce the 
runtime a bit, thus reducing the gap.
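
Sketch of the set-difference check (_field_names being a hypothetical 
frozenset cached once per model class):

def check_kwargs(cls, kwargs):
    # one set op replaces per-kwarg pops and their KeyErrors, and
    # collects *every* bad kwarg rather than just the first
    invalid = set(kwargs) - cls._field_names
    if invalid:
        raise TypeError("invalid keyword argument(s): %s"
                        % ", ".join(sorted(invalid)))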


> >>> t = timeit.Timer("Person(id=1, first='First', last='Last')", "from
> blah.models import Person")
> >>> t.timeit()
> 29.819494962692261
> 
> So that's about a 15% speed improvement over the current
> __init__(**kwargs) at the cost of losing that same 15% since you can't
> do *args.

Honestly, I dislike from_tuple instantiation, but I also dislike 
chucking out *args support w/out bringing kwargs processing far closer 
in processing speed.  The reasoning is pretty simple- most people's 
implicit usage of Model.__init__ is instantiation of db returned 
results- row results, in other words.  I'm sure there are people out 
there who have far heavier writes than reads db wise, but I'd expect 
the majority of django consumers are doing reads, which in my own 
profiling still was one of the most costly slices of django CPU 
runtime.

So, if row based instantiations are the majority usage, and DBAPI 
(last I read) doesn't mandate a default way to fetch a dict 
instead of a row, you might want to take a closer look at the cost of 
row conversion for Model.__init__.  It'll be more than a 15% slowdown 
of db access processing wise, and should be measured- at the very 
least, to know fully what is gained/lost in the transition.
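
Putting a number on the conversion cost is a one-liner with timeit 
(nothing django specific here):

import timeit

setup = "fields = ('id', 'first', 'last'); row = (1, 'First', 'Last')"
# the dict(zip(...)) every row read would pay if *args goes away
t = timeit.Timer("dict(zip(fields, row))", setup)
print t.timeit()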


> [Note, however, that this speed argument is a bit silly when __init__
> fires two extremely slow signals. Improving signal performance --
> which is very high on my todo list -- will probably make this whole
> performance debate a bit silly]

Kindly tag ticket 4561 in then- signaling speed gains can occur, but 
nothing will match the speed of *not* signaling, and instead signaling 
only when something is listening (which that ticket specifically adds 
for __init__, yielding a ~25% reduction).

Just to be clear, I understand the intent of making the code/api 
simpler- at the same time, we're talking about one of the most 
critical chunks of django speed wise in my experience.  Minor losses 
in performance can occur, but I'd rather not see a 20% overhead 
addition for it.

Aside from that, if you can post the patch somewhere for what you've 
tweaked, I'd be curious.

Thanks,
~brian




Re: middleware exception catching for core handlers

2008-01-05 Thread Brian Harring
On Tue, Nov 20, 2007 at 02:30:24PM -0600, Gary Wilson Jr. wrote:
> 
> We don't seem to be catching exceptions raised in response or request 
> middleware, causing bare exceptions to be displayed.  Is there a reason 
> we are not catching these and displaying pretty error pages?
> 
> Which leads me to another observation.  There seems to be a bit of 
> duplication in the WSGIHandler/ModPythonHandler and 
> WSGIRequest/ModPythonRequest classes.

Resurrecting this email, anyone got a valid reason for not 
intercepting these exceptions?

I know I got a nasty surprise when I noticed it in my apache logs.

~brian




Re: signals

2008-01-04 Thread Brian Harring
On Thu, Dec 20, 2007 at 10:45:24PM -0600, Jeremy Dunck wrote:
> 
> On Jun 13, 2007 8:29 PM, Brian Harring <[EMAIL PROTECTED]> wrote:
> ...
> > Either way, I'm still playing locally, but I strongly suspect I'll
> > wind up chucking the main core of dispatch.dispatcher and folding it
> > into per signal handling; may not pan out, but at this point it seems
> > the logical next step for where I'm at patch wise.
> ...
> 
> 
> Brian,

Helps to CC me; I've been crazy busy over the last few months and 
unfortunately have been watching this ml less and less.


>   Did anything come of this?  I'm interested in several uses of
> signals.  At the Dec 1 sprint, Jacob said he'd prefer not to add any
> signals until the performance issues are addressed.
>   I'm willing to work at improving the performance, but David
> indicated that you may have something worth supplying back to django
> trunk?

Haven't gone any further on my signals work since #4561 is currently 
bitrotting- the intention was to basically shift the listener tracking 
all into individual signal objects; the basic machinery required is 
available in the patches posted on that ticket.

Presuming folks liked the approach, the intended next step was to 
shift away from using django.dispatch.dispatcher.connect to bind a 
callable to a signal, to signal_inst.connect(func) as the default, 
gutting the remaining dispatcher internals and shifting them into the 
signal objects themselves.
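
In api terms, the shift would look roughly like this (sketch; 
SomeModel being a placeholder for any model class):

from django.db.models import signals
from django.dispatch import dispatcher

def my_handler(sender=None, instance=None, **kwargs):
    pass

# current, dispatcher mediated:
dispatcher.connect(my_handler, signal=signals.post_init,
                   sender=SomeModel)

# intended: the signal instance owns its receiver list
signals.post_init.connect(my_handler, sender=SomeModel)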

In the process, the robustapply bits move up from checking every time 
the signal fires to checking when the connection is established- the 
end result of that would be faster signaling, and a reduction of a lot 
of the nastiness robustapply attempts at the point of signaling.

Either way, #4561 is the guts of my intentions- resurrecting signals 
work starts at resurrecting that patch ;)

If there is interest in getting this committed, I would like to know- 
that patch alone was another 25% speed up in model instantiation when 
no listener was connected for that model.

One additional optimization point for signals is deciding whether

connect(Any, f1)
connect(Any, f2)

must execute f1, f2 in that explicit order- internally, dispatcher 
uses some nasty linear lookups that could be gutted if f1/f2 ordering 
can vary.  Never got any real feedback on that one, so I have left it 
alone.
~brian




Re: Some concerns about [6399]

2007-09-21 Thread Brian Harring
On Fri, Sep 21, 2007 at 09:43:26PM -0500, James Bennett wrote:
> 
> While I'm all in favor of faster template variable resolution, I've
> been poking at the new Variable class and noticed that the patch was
> apparently never updated for a post-Unicode Django; in some fairly
> simple tests I've been able to break it pretty badly by passing in
> Unicode strings and watching it blow up when it tries to output them.

Examples would be lovely; that particular patch preceded the 
unicode merge by 6 or so months :)

The only spot I can see this potentially causing issues is the 
'literal' fast pathing, since it just reuses what was passed in- 
forcing that to unicode in __init__ ought to resolve it.  If it's 
something other than that, I'm definitely after examples.
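
I.e. something along these lines for the literal path (sketch, not 
the actual Variable code):

from django.utils.encoding import force_unicode

class LiteralVariable(object):
    def __init__(self, value):
        # coerce once at parse time rather than blowing up when the
        # value is concatenated into output later
        self.literal = force_unicode(value)

    def resolve(self, context):
        return self.literal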


Related, Jacob, why did you combine the lookup variable with the 
literal variable?  While you can combine them, that means
1) extra work is done without reason per resolve 
(admittedly background noise work for the check per resolve)
2) it makes it harder for any parsed tree optimizer to collapse 
literals down.

The latter is the real reason I split it up- I'm well aware a template 
optimizer might sound insane (why optimize it when it's parsed 
every time?), but it's a matter of orthogonal design- do it now, get a 
background noise gain, but also knock off a step towards a different 
goal.  The reason I'm still thinking in that direction is that 
template rendering starts to break down a bit when you're rendering a 
massive number of nodes (think for loops of for loops of for loops ;), 
and anything that can reduce the amount of work involved would be 
useful.

It's not a huge thing, but I didn't see any reason for the change, 
thus asking now.

Thanks,
~harring




Re: Creating and using a project-specific database backend?

2007-08-30 Thread Brian Harring
On Thu, Aug 30, 2007 at 07:20:09PM +0800, Russell Keith-Magee wrote:
> 
> On 8/30/07, George Vilches <[EMAIL PROTECTED]> wrote:
> >
> > Folks,
> >
> > Now that the database backend refactoring has landed, and DB
> > functionality is really easy to extend, how does everyone feel about the
> > possibility of allowing people to specify their own database backends
> > within their projects (i.e., without modifying the Django source tree in
> > any way?)  I see this as an excellent way for people to increase their
> 
> The broad idea seems reasonable to me. There's no point having an
> easily pluggable database engine if you can't plug in your own
> database :-)
>
> Regarding the approach - I'm inclined to prefer #1. It's simple and
> easy to explain, and I don't see that there is that much potential for
> side effects.

+1 from me; original patch v2 did this already anyways (we 
specifically need it).

So... can we please get the remaining bits of the backend refactoring 
pulled in?  :)

~harring




Re: db backend refactoring

2007-08-22 Thread Brian Harring
On Mon, Aug 20, 2007 at 03:13:52PM +1000, Malcolm Tredinnick wrote:
> 
> On Sun, 2007-08-19 at 23:01 -0500, Adrian Holovaty wrote:
> [...]
> > I haven't yet made DatabaseError, IntegrityError and the introspection
> > and creation functionality accessible via the django.db.connection
> > object. Those changes can happen in the near future, if they need to
> > be made.
> 
> If we do move those (not clear why it would be needed, though),

Ironically, it's kind of core to why I did the refactoring, and I 
covered it already in the original email (cons 3 & 4) :)
Expanding a bit, the intention was to clean up access for multidb 
crap, and to enable simple wrappers to be generated/used.  Think 
through what's required to wrap a random backend (say mysql) right 
now; you have to have either 

1) a bunch of crappy one line modules importing from the actual target 
(if you're doing it right, it checks a global to know which backend to 
wrap, or if you're in a hurry you wind up with 
separate persistent-mysql and persistent-sqlite backends ;)
2) fun code that does module creation/injection via sys.modules

Both routes kind of suck in a single connection environment; that 
said, both routes are *unusable* in multidb.  Think it through- if 
the two backends are mysql, hey hey, same exception, lucky you.  What 
if the backends are mysql and pgsql, and you need to catch the mysql 
db exceptions? [1]

Basically, 

from django.db import connection, IntegrityError, DatabaseError
try:
    connection.do_something()
except (IntegrityError, DatabaseError):
    pass

vs 

from django.db import connection
try:
    connection.do_something()
except (connection.IntegrityError, connection.DatabaseError):
    pass

The latter localizes the exception class to the connection itself- 
more importantly, it allows (in a true mdb env) passing around a 
namespace for accessing that *specific* connection, instead of trying 
to catch an exception from a non-default db/cursor with the default 
db's exception classes (as is the attempted case for the mdb branch).

That ought to lay out the logic for why localizing the specifics of 
each connection implementation *to* the connection object itself is 
needed.


[1]: while one could point at the multidb branch's
django.db.DatabaseError as contradicting this, please verify that code 
works- that code tries to catch exception x (class A) with class B, 
with no relation between the two exception classes aside from a common 
ancestor class; the localizing bits in that are just noise, and I see 
no possible way that could work.


> it'd be nice to leave backwards compat references in place. We pull out those
> exceptions into a db-neutral namespace because people are using them a
> lot. So changing it will have a big impact and leaving an alias in place
> is free.

Not free; a long term price is extracted in feature implementation due 
to its existing.  The simpler/saner solution is to have common 
DatabaseError/IntegrityError exception classes at that location; from 
there, wrap the exceptions thrown from the underlying backend in 
them.  Sucks that the tb from the backend is truncated, but 
realistically that info can be stored if needs be (rarely is there 
much info anyways, since it usually terminates in cpy code).
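
Sketch of that wrapping (names hypothetical, python2 syntax):

class DatabaseError(Exception):
    # backend-neutral; keeps the original backend exception around
    def __init__(self, original):
        Exception.__init__(self, str(original))
        self.original = original

def wrapping(func, backend_exc, common_exc=DatabaseError):
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except backend_exc, e:
            # original tb is truncated as noted above, but the
            # backend exception itself survives for inspection
            raise common_exc(e)
    return wrapper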

Mind you, I'm talking about just the exceptions; the misc ops, etc, 
I generally think should be shifted over, with code migrating to 
accessing the sub-namespaces from the actual connection object.


So... unless folks either don't want that functionality, or can poke 
large holes through my args above, it's needed.  One thing I would 
like folded in from the v2 patch, however, is the support for accessing 
backends from locations other than django.db.backends.* (not required, 
but collapsing the namespace onto the connection object helps 
this)- mainly, I don't think it's good practice to require folks to 
monkeypatch stuff into django when they want to use custom backends.  

Still need to go through and do the backends cleanup/refactoring, but 
I probably won't get to it till this weekend- depending on feedback 
from this email, it may include the chunks of my v2 that were left out 
also.

thanks,
~harring




Re: db backend refactoring

2007-08-14 Thread Brian Harring
On Wed, Aug 15, 2007 at 09:46:43AM +0700, Ben Ford wrote:
>Hi Brian,
>Just a quick question from me regarding your comment about supporting
>multiple databases... I've had a quick look through your patch and it
>seems to cover a lot of the ground that the multi-db has already
>changed.
>  * I was wondering if you're familiar at all with the multiple-db branch
>yet? (You sent me an  email about getting the patch for it a little
>while ago, and just wondered if you'd looked at the existing code).

Familiar (walked the 4188 last integration merge up), although I'm 
still waiting on you to post the svk/branch publicly ;)

>  * If you are do you have any idea how much your patch will impact the
>existing work?

Multidb crap is reliant on work of this sort- encapsulating the 
connection, basically, so that mdb can handle N dbs.

>  * How do you see supporting multiple databases progressing?

MDB layers over what I'm working on; mdb's global connection 
implementation (namely, still attempting to rely on the global), and 
specifically the attempt to collapse the misc backend bits down, isn't 
clean at all imo.

>I've been really busy lately, but if you think it might help, I can
>update the branch with the latest changes from trunk and either fire
>you a patch or try and get the mirrored SVK repo somewhere public...
>Cheers,

Updating it would be useful; I went digging on sunday before doing 
this, but what mdb has for the backend cleanup still needs the same 
cleanup I'm doing.

~harring




Re: db backend refactoring

2007-08-14 Thread Brian Harring
On Tue, Aug 14, 2007 at 04:36:22PM -0500, Adrian Holovaty wrote:
> After a cursory read-through the patch, my only major complaint is the
> unnecessary cleverness of some of the code. 

An alt example besides BackendOps/BackendCapabilities would be 
appreciated (your phrasing implies there are more); those two classes 
are frankly borderline frankenstein at the moment due to continual 
mangling.


> The stuff in BackendOps,
> for example, could be rewritten in a much clearer way. (Granted, it
> would be longer, but that's an acceptable tradeoff, in my opinion.)
> 
> For example, this is in the patch:
> 
> class BackendOps(object):
>     required_methods = ["%s_sql" % x for x in
>         ('autoinc', 'date_extract', 'date_trunc', 'datetime_cast',
>          'deferrable', 'drop_foreignkey', 'fulltext_search',
>          'limit_offset', 'random_function', 'start_transaction',
>          'tablespace')]
>     required_methods += ['last_insert_id', 'max_name_length',
>         'pk_default_value', 'sequence_name', 'get_trigger_name']
>     required_methods = ["get_%s" % x for x in required_methods]
>     required_methods = tuple(required_methods)
> 
> And this is a much, much clearer way of expressing exactly the same thing:
> 
> class BackendOps(object):
>     required_methods = (
>         'get_autoinc_sql',
>         'get_date_extract_sql',
>         'get_date_trunc_sql',
>         'get_datetime_cast_sql',
>         'get_deferrable_sql',
>         'get_drop_foreignkey_sql',
>         'get_fulltext_search_sql',
>         'get_limit_offset_sql',
>         'get_random_function_sql',
>         'get_start_transaction_sql',
>         'get_tablespace_sql',
>         'get_last_insert_id',
>         'get_max_name_length',
>         'get_pk_default_value',
>         'get_sequence_name',
>         'get_get_trigger_name'
>     )
> 
> Not only is it clearer, but it saves us from having to do calculations
> at run time.

While that may skip a calculation, I'm honestly surprised there is no 
complaint about the 2-10 lines below it, where a curried func is shoved 
into locals (half expected a knee jerk "yuck" on that one ;).

As indicated further up, those two classes will need another cleanup; 
the currying/locals I'd advocate leaving in, but expanding 
required_methods there is definitely doable- its form right now is 
just a leftover from how I went about pinning down which ops were 
'standard' for backends.  Crap like that will be removed next time I 
do a full walk of the changes (this round is basically "this is the 
api I'm proposing, and here is a working patch for the api").

So... any spots that don't read as just idiocy?  Specific complaints 
about the presented API, separation, etc, are what I'd like to get if 
possible- I still have a lot of gutting to do internally anyways, so 
the form of it ought to improve.

~harring




Re: Proposal: runserver --with-fixture

2007-08-13 Thread Brian Harring
On Mon, Aug 13, 2007 at 04:31:42PM -0500, Adrian Holovaty wrote:
> So I was writing Django view unit tests and setting up fixtures with
> sample data, and it hit me -- wouldn't it be useful we made it easy to
> run the Django development server with fixture data?
> 
> I'm proposing a "--with-fixture" flag to django-admin.py, so that you
> could do something like this:
> 
> django-admin.py runserver --with-fixture=mydata.json
> 
> With this command, Django would:
> 
> * Create a new test database (following the TEST_DATABASE_NAME setting).
> 
> * Import the fixture data into that fresh test database (just as the
> unit test framework does).
> 
> * Change the DATABASE_NAME setting in memory to point to the test database.
> 
> * Run the development server, pointing at the test database.
> 
> * Delete the test database when the development server is stopped.
> 
> The main benefit of this feature is that it would let developers poke
> around their fixture data and, essentially, walk through unit tests
> manually (in a Web browser rather than programmatically).

unittests, or doctests?  Walking through unittests is already viable; 
doctests, not really (or at least not even remotely pleasantly).


> In the future, the next step would be to keep track of any database
> changes and optionally serialize them back into the fixture when the
> server is stopped. This would let people add fixture data through
> their Web application itself, whether it's the Django admin site or
> something else. Now *that* would be cool and useful! But I'm only
> proposing the first part of this for now.

That one actually worries me a bit; personally, I view doctests as 
great for *verifying* documentation, but when used as api tests (as 
django does), they have two failings imo-

1) doctests are really a pita if you've grown accustomed to the 
capabilities of normal unittests; further, they grow unwieldy rather 
quickly- a good example would be regressiontests/forms/tests.py; 
I challenge anyone to try debugging a failure around line 2500 in that 
beast; sure, you can try chunking everything leading up to that point, 
but it still is a colossal pain in the ass to do so when compared to 
running

trial snakeoil.test.test_mappings.FoldingDictTest.testNoPreserver

Basically, I'm strongly of the opinion that while doctests are easy to 
create, they aren't all that useful for true assertions when something 
breaks; at the very least, the ability to run a specific failing test 
(the above trial command) is damn useful, but not viable with 
doctests.


2) Adding functionality to automatically collect/serialize a stream of 
interp. commands, while shiny, is going to result in a lot of 
redundancy in the tests- I strongly suspect such functionality will 
result in just building up a boatload of random snippets, many hitting 
the same code in the same way; while trying to brute force the test 
source is one way to test for bugs, actual tests written by hand (with 
an eye on the preexisting tests) are a bit saner imo.

Realize django devs take the stance that "doctest/unittest, whatever, 
it's tests", but I'm not sure I'd recommend that route.

Debugging tool, hey, kick ass; serialization bit, eh, if it scratches 
your itch...

~harring




db backend refactoring

2007-08-13 Thread Brian Harring
As hinted at earlier on the ml, have started doing some work on 
refactoring the actual db backend; ticket 5106 holds the current 
version (http://code.djangoproject.com/ticket/5106).

Best to start with perceived cons of current design of the backends-

1) redundancy of code; each backend implementation has to implement 
the exact same functions repeatedly- _(commit|rollback|close) are 
simple examples; better examples are the debug cursor, the dictfetch* 
assignments in each base module, and the repeated get_* func 
definitions in the base module.  By the looks of it, each backend was 
roughly developed via copying an existing one over and modifying it 
for the target backend- this obviously isn't grand (why fix a bug once 
when you can fix it in 7 duplicate spots? ;)

2) due to the lack of any real base class/interface, devs are 
basically stuck grepping each backend to identify what functionality 
is available; track the usage of get_autoinc_sql in core/management 
for example- some spots protect themselves against the function 
missing, some spots assume it always exists (always exists, best I can 
figure).  The lack of real OOP for the backend code also means that 
django is slightly screwed in terms of trying to make changes to the 
backend- instead of adding a compatibility hack in one spot, you have 
to add it to each and every backend.  Not fun.

3) reliance on globals; this one requires some explanation, and a 
minor backstory; mod_python spawns a new interpreter per vhost; if 
you have lots of vhosts in a worker/prefork setup, this means you 
bleed memory like a sieve- not fun.  The solution (at least my 
approach) is to mangle the misc globals django relies on so that they 
are able to swap their settings on the fly per request (literally 
swapping $DJANGO_SETTINGS_MODULE/django.conf.settings._target), 
and to force mod_python to reuse the same interpreter.  Upshot of 
it, for our usage at curse-gaming, this means growing >400 
mb/process limited to 100 requests becomes ~40mb/process, unlimited 
requests (we have a veritable buttload of vhosts).  Assume a min of 
~20 idle workers, and you get an idea of why globals are more than a 
wee bit anti-scaling for a setup with a large # of vhosts.  

Getting back to the db refactoring, reliance on globals throughout 
django code means that tricks like that are far harder, and it adds 
more work for multidb code/attempts; that codebase requires a 
reduction of global reliance (quote_name is a simple example- the 
quoting rules for mysql aren't the same as oracle/pgsql/sqlite, thus 
you need to get the quoter for the specific backend).  The old mantra 
about globals sucking basically is true; access to misc backend 
functionality really needs to be grabbed via the actual backend 
object itself if there is ever an intention of supporting N backends 
w/in a single interpreter.

4) minor, but annoying; the forced module layout means writing 
generated/new backends is tricky; further, you have to shove the 
backend into django.db.backends (the hardcoded location is addressable 
w/out the refactoring, although the layout issue would remain).


What I'm implementing/proposing;

1) shift access to introspection/creation/client module 
functionality to;

connection.introspection # literal attr based namespace
connection.creation # literal attr based namespace; realistically 
  # could shift DATA_TYPES to connection.DATA_TYPES and drop creation.
connection.runshell # func to execute the shell

2) shift access of base.* misc. bits into 5 attrs;

connection.DatabaseError  # should realistically be there anyways, and 
  # potentially accessible on the cursor object
connection.IntegrityError # same
connection.orm_map # base.OPERATOR_MAPPING
connection.ops # basically the misc get_*, quote_name, *_transaction, 
  # dict* methods floating in base; 
connection.capabilities # the misc bools django relies on to discern 
  # what sql to generate for the backend; allows_group_by_ordinals, 
  # allows_unique_and_pk, autoindexes_primary_key, etc

3) convert code over to accessing connection, instead of backend.  
Kind of a given this breaks the hell out of current consumers doing 
sql generation (moving quote_name for example), but the api breakage 
can be limited via adding a temporary __getattr__ to the base 
connection class that searches the new compartmentalized locations, 
returning from there.  Not a good long term solution, but it should be 
an effective intermediate band aid.
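
The band aid would be something like this (sketch; names as proposed 
above, obviously subject to change):

class BaseDatabaseWrapper(object):
    def __init__(self, ops, capabilities, introspection):
        self.ops = ops
        self.capabilities = capabilities
        self.introspection = introspection

    def __getattr__(self, attr):
        # compat shim: old flat attribute access (e.g.
        # connection.quote_name) falls through to the new
        # compartmentalized namespaces
        for ns in (self.ops, self.capabilities, self.introspection):
            if hasattr(ns, attr):
                return getattr(ns, attr)
        raise AttributeError(attr)

# consumers migrate from connection.quote_name('tbl') over to
# connection.ops.quote_name('tbl'); the shim keeps the old spelling
# limping along in the interim.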


Basically, the pros of the approach are:

1) fixes, or enables the next step in fixing, the listed cons above in 
mainline django (instead of folks just forking off django with their 
needed changes).
2) an actual base interface is present for the backend bits, making 
things less of a crapshoot when trying to write backend agnostic code.
3) connection reuse/pooling can be inlined (or wrapped, the 
implementation will be similar enough) for backends that lack it; a 
fun example that came to mind was writing a simple wrapper to collect 
the 

Re: Adding hooks to methods that generate SQL in django/core/management.py

2007-08-12 Thread Brian Harring
On Sun, Aug 12, 2007 at 09:20:06AM -0400, George Vilches wrote:
> Something that I found a little irksome while working on this is that 
> the DatabaseWrapper class for each backend doesn't inherit from some 
> logical parent.  I know that all the db-specific functionality is 
> wrapped by these classes, but there are things that are in each class 
> that are most definitely shared functionality, like the mechanism by 
> which util.CursorDebugWrapper is instantiated.

Might want to take a look at ticket 5106; I was working on connection 
pooling earlier this week, but ran into the same "wow, backend code 
needs some serious refactoring"- will push the patch later today, but 
basically I have shifted everything so that the construction, 
inspection, and client bits are all bound to the actual database 
wrapper instance itself- via that, a generic connection pooler can 
just wrap the real db wrapper.  You would be able to do the same thing 
I suspect- just pick out the correct method, and insert the logging 
there.
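
E.g. a logging wrapper becomes a few lines (sketch; log being whatever 
sink you want):

class LoggingCursor(object):
    def __init__(self, cursor, log):
        self.cursor = cursor
        self.log = log

    def execute(self, sql, params=()):
        self.log("%s -- %r" % (sql, params))
        return self.cursor.execute(sql, params)

    def __getattr__(self, attr):
        # everything else passes straight through to the real cursor
        return getattr(self.cursor, attr)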


~harring




Re: Maybe we need more triagers? Or something else?

2007-08-11 Thread Brian Harring
On Sat, Aug 11, 2007 at 08:22:55PM +0800, Russell Keith-Magee wrote:
> However, ultimately, I don't think that a technological solution will
> fix this problem. The voting app is a neat idea, but ultimately the
> problem isn't the voting - it's working out what to vote on in the
> first place, and once a decision has been made, finding time to fix
> the problems that have been decided upon.
> 
> From my perspective, Simon's idea of a weekly 'ticket roundup' is
> worth pursuing. Since the triage team is paying the closest attention
> to to the ticket stream, they are in the best position to be able to
> identify tickets with lots of activity or groups of tickets on a
> common theme. Filtering out this sort of information isn't easy to do
> with an automated system - it really requires the involvement of some
> gray matter.
> 
> A weekly email could serve as a useful focus for discussion - a
> regular commitment to look at a small number of pressing issues, and
> to stimulate discussion around the immediate plans for the project.
> Some of these decisions could also be fed back into the weekly status
> update. Working through the backlog might take a while, but the only
> way to attack the problem is a little bit at a time.

While reordering the bug flow is a good thing, I'm not really sure 
about the gain of it.  Basically, better tools/flow doesn't mean 
faster resolution/commits- it just means things are organized better.  
A weekly email *would* be nice, but I view it as a stopgap approach to 
try and nudge folks to do something about tickets already queued up.

Personally, the bottleneck I've noticed is in actually killing the 
ticket off- committing.  Assuming that 'ready for checkin' literally 
means "commit the bugger", take a look at the # of tickets sitting in 
that stage.  Then dig into a few and see how *long* they've sat in 
that state.

To be clear, the intention of this email isn't "the committers suck 
and need to commit my stuff"; it's more "there aren't enough people 
actually committing stuff".  Via that viewpoint, you can see why I 
doubt tweaking the triaging is going to stem the ticket build up- 
it'll just make things a bit prettier/ordered, but I don't 
particularly see how that is going to turn the tides.

My two cents, for what it's worth ;)

~harring




Re: docstrings

2007-07-26 Thread Brian Harring
On Thu, Jul 26, 2007 at 07:19:01PM +0200, Nicola Larosa wrote:
> > People seem to forget that one of the key rules in any coding guidelines
> > is "do what the existing code does" (see, e.g., the second section of
> > PEP 8). Thus, our current standards are in not in conflict with PEP 8 or
> > PEP 257.
> 
> Are Django committers willing to accept patches that reformat lines within
> 80 characters?

Would personally be somewhat opposed to such patches; while I agree 
with 80 characters, reformatting code has a nasty knack of making it a 
serious pita to deal with patches from before the reformat point.

Why not just correct as we go?  In other words, if I'm already 
modifying a spot, do the reformat for the sections being modified.  
Yes, slightly more noise in the patch, but at least the patch has a 
chance in hell of applying- after unicode, I've already gone through a 
pretty heavy set of updates, and am not particularly looking forward 
to another couple of hours correcting for bitrot :)

~harring




Re: Patch vs checkin

2007-07-14 Thread Brian Harring
On Fri, Jul 13, 2007 at 11:28:27PM -0500, Gary Wilson wrote:
> 
> Brian Harring wrote:
> > On Wed, Jul 11, 2007 at 09:38:54AM -0500, Deryck Hodge wrote:
> >> For example:
> >>
> >> mkdir repos
> >> cd repos
> >> bzr init
> >> svn co http://code.djangoproject.com/svn/django/trunk/ django_trunk
> >> bzr add django_trunk
> >> bzr commit
> >>
> >> Then you can bzr branch from there to have as many branches as needed.
> > Doing that makes merging a fairly major PITA: alternative-
> 
> What do you mean by this?  Merging from others' branches?  Merging to 
> svn repo?  I've been using bazaar lately for my work on Django, but have 
> only really used it on my local machine.

Pardon the delay, been rather busy.  If you're just importing vcs 
trunk in as a single rev in bzr, you're losing data along the way- 
specifically, losing each rev from the branch you're merging.  This 
limits the merging ability (and limits weave merging a bit), and makes 
things a pita if you're trying to identify why something changed.

Basically, keep the rev granularity- importing vcs head and committing 
that as one rev creates problems long term; the simplest example is 
that if I try to merge your branch into mine, it can't easily tell 
which revs are shared between our branches from the trunk imports.  If 
it's the only option you have, sure, go that route, but personally 
I've found it winds up wasting my time :)

~harring




Re: Proposal: customtags, a higher-level API for template tags

2007-07-14 Thread Brian Harring
On Fri, Jul 13, 2007 at 11:19:53PM -0500, Tom Tobin wrote:
> 
> On 7/12/07, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
> >
> > A few comments (mostly questions) from an initial reading:
> 
> 3) Regarding Brian Harring's #3453, the type conversion logic is
> redundant between his approach and mine.  taghelpers already avoids
> the issue he's trying to bypass, as it performs type conversion in one
> initial sweep and stores the literal values for later. I'd rather not
> use his approach in taghelpers; I don't like the idea of passing
> around LiteralVariables when I can be passing around
> honest-to-${deity} integers, strings, etc.  :-)

You actually *do* want to use my approach.  Note the "resolve_variable 
leads to redundant work" summary- the up front 'type conversion' 
(literal/constant conversion is a bit more accurate) check is done 
every time parsing occurs; that patch just converts parsing down 
to a one time cost.  The result is to speed up template variable 
lookup.

For your approach, normal usage of a node with args means render 
triggers self.resolve_context, which in turn invokes the module level 
resolve_context, which invokes resolve_variable, which does the 
conversion every render.

In other words, the purpose of my patch *still* exists- nuke the 
redundant reparsing of the var path.  Your patch's implementation 
means this gain isn't possible, since the re-parsing is forced 
every time (further, the basic algo of it is duplicated in at least 2 
spots from what I can tell).


> Ideally, I'd like to yank the type conversion logic out of
> resolve_variable, as I believe type conversion and variable resolution
> are really two separate things and should be available separately as
> needed.

#3453's patch already *does* this; two classes, LiteralValue (this is 
the value to shove in every time this node is rendered), and 
ChainedLookupVariable (do a context lookup)...


Meanwhile, back to your basic patch... while this is interesting 
code, frankly I'm not seeing the gain in it.  A few reasons;

1) Guarantee you've not timed this sucker.  I suggest you do so; it's 
going to be a noticeable hit to tag instantiation, and more importantly 
tag invocation- a 2x increase on simple tags wouldn't surprise me.  
The reasoning pretty much comes down to the overhead involved, and the 
fact your base code tries to assign everything back to the node 
instead of just returning back to the render func.  If the node needs 
to preserve state, let its render do it- don't mutate the node every 
run (it's slower, and opens up an angle for bugs).

2) Needless complexity.  Example: 

class SomeNode(TemplateTag):
    class Node: pass
    class Constants:
        foo = 1
        foo2 = 2

becomes

SomeNode.constants['foo'].  *cough* why not the following...

class SomeNode(TemplateTag):
    class Node: pass
    constants = {'foo': 1, 'foo2': 2}

Latter is the same, using normal pythonic constructs people know/love, 
and doesn't abuse classes as a forced indent (basically).


3) an extension of #2; while Model does make *good* use of 
metaclasses/sub classes shoved into a class, it's also far more 
complex.  This code could pretty easily have all of this shoved into 
the __init__ of a target node, thus getting rid of the separated 
parser func; instead, just register the class itself (which would 
handle the parsing and rendering)- a fair bit simpler, and via that 
approach far more flexible also (the tag author has total control of 
__init__, thus can do whatever they want).


4) Strongly doubt subclassing of TemplateTag nodes (as in, second 
level) is possible/works correctly, going by the arg_list bits- 
willing to bet a failed parsing of a node trashes that node class's 
ability to make new instances of itself, due to the mangling of the 
class for each attempted instance generation.


Errors/issues I spotted follow; I suspect there are more, but it's 
2.5k lines, a lot to dig through (that, and the instantiation logic is 
_not_ simple to follow):

1) 'class Node: pass' from above is actually incorrect; 
'class Node(object): pass' is what's required if you're being 
careful, since your code tries to mix in Node, which is new-style- you 
can get odd interactions mixing old style in with new; if one is new, 
all should be new.

2) TemplateTag.__new__ is unneeded.  Drop it.

3) use itervalues(), iteritems().  Seriously, there isn't any reason 
to use the others if you're just iterating over the mapping without 
mutating it.

4) line 1729, identification of the 'validators'.  The code will 
misbehave with 'validate_spork_' and 'validate_spork'; dicts have no 
guaranteed ordering, meaning which validator gets used there can vary- 
and that's ignoring the fact it allows two validator methods to map to 
the same arg.

5) line 1890 of your patch, 'while self.arg_list'.  Pass the list into 
process_arg instead of having arg_list silently screw with 
self.arg_list on its own; at least via passing it in, you know it may 
mutate it.  Not particularly sure 

Re: Proposal: QuerySet.exists() method

2007-07-13 Thread Brian Harring

On 7/13/07, Adrian Holovaty <[EMAIL PROTECTED]> wrote:
> On 7/13/07, SmileyChris <[EMAIL PROTECTED]> wrote:
> > Adrian, I think it's useful enough. But why do you need a .exists() if
> > we could just use __nonzero__ (like Ivan suggested)?
>
> I hadn't thought of that! Yes, we should definitely implement a
> QuerySet.__nonzero__() method. However, I'd like there to be an
> explicit method (named either exists() or any() or whatever), because
> it's easier to say this:
>
> To determine whether a QuerySet has at least one record, call its
> exists() method.
>
> ...than this:
>
> To determine whether a QuerySet has at least one record, put it in
> an "if" statement

tell them to do bool(QuerySetInstance) then; it's basically what if is
doing anyways, and any decent python user will recognize the bool as
being redundant.
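
Usage wise it's just the following (sketch, assuming __nonzero__ lands
on QuerySet; Person being a placeholder model):

qs = Person.objects.filter(last='Last')
if qs:                  # __nonzero__ fires here
    print 'have rows'
has_rows = bool(qs)     # the explicit spelling, no new method needed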

> It's convenient to be able to access the result of exists() inline,
> rather than having to use it in a certain context, such as within an
> "if" clause.  Sure, if we went with only __nonzero__(), you could call
> __nonzero__() directly, but accessing the Python double-underscore
> magic methods is slightly ugly.

And kills kittens (the protocol methods there usually should be
accessible via the paired builtin ;).

Either way, -1 on exists, +1 on nonzero.
~harring




Re: Patch vs checkin

2007-07-11 Thread Brian Harring
On Wed, Jul 11, 2007 at 09:38:54AM -0500, Deryck Hodge wrote:
> 
> On 7/11/07, Jacob Kaplan-Moss <[EMAIL PROTECTED]> wrote:
> > The best thing to do is host your changes remotely. That way there's
> > no implicit assumptions about if and when it'll get merged into
> > Django.
> >
> 
> Hi, Jacob.
> 
> I'll mention Bazaar, too.  My experience with it has been very pleasant.
> 
> I get a copy of Django from the official SVN, create a bzr repo off
> that SVN version as is (.svn files and all).  I then use bzr to track
> my changes, which still allows me to SVN up.  Merging is basically
> just an svn up away unless I touch a file that has also been updated,
> in which case I can use bzr to see what came from me instead of via
> SVN.
> 
> The UI for bzr is nearly identical to svn, so svn users should be able
> to use it with little learning curve.
> 
> For example:
> 
> mkdir repos
> cd repos
> bzr init
> svn co http://code.djangoproject.com/svn/django/trunk/ django_trunk
> bzr add django_trunk
> bzr commit
> 
> Then you can bzr branch from there to have as many branches as needed.
Doing that makes merging a fairly major PITA: alternative-

bzr branch http://pkgcore.org/~ferringb/django/trunk

using bzr-svn to proxy svn directly into bzr; it's usually up to date 
within a day or so, although if folks beyond me use it I can cronjob 
it to be a bit more regular.

The only real downside I see to this route is that trac doesn't 
understand how to render bundles, so you wind up having to do bzr diff 
instead of just bundling the revs you want to throw at upstream.

~harring




Re: Proposal: more granular select_related

2007-07-03 Thread Brian Harring
On Tue, Jul 03, 2007 at 07:34:55AM -, Johann C. Rocholl wrote:
> 
> Often I want to preload just some foreign keys, not all of them as
> select_related() does. Other people seem to need this feature too:
> http://groups.google.com/group/django-developers/browse_thread/thread/e5e0de59e8304bcd/bb93410289bc19b7
> http://code.djangoproject.com/ticket/3275
> 
> My solution: I have written a function that preloads only the
> requested foreign keys, and only for the objects that are already
> selected. My implementation is available here:
> http://trac.browsershots.org/browser/branches/shotserver-django/shotserver04/common/preload.py
> 
> My implementation has one advantage over the field parameter for
> select_related: you can pass a query set that you already have, and
> then you don't need any database queries again.

You might want to take a look at ticket 17 also, since it provides 
similar capabilities automatically- if that particular pk was already 
pulled, it grabs the instance in memory instead of hitting the 
backend again, and generating a new model instance.

~harring




Re: template rendering as iteration, instead of layered concatenation

2007-06-22 Thread Brian Harring
On Fri, Jun 22, 2007 at 04:54:27PM +1000, Malcolm Tredinnick wrote:
> 
> On Thu, 2007-06-21 at 23:29 -0700, Brian Harring wrote:
> [...]
> > While it's not performance related, one additional argument against 
> > iter_render via James Bennett is that it makes third party code 
> > supplying their own Node derivatives a little tricky if they're trying 
> > to support svn and <=0.96 .
> 
> Why is this an issue? Those classes will implement render() and
> Node.iter_render() will call that when needed. They just ignore
> iter_render() without any harm at all. I thought that was the big
> advantage of your circular render() <-> iter_render() calls in Node: the
> API is backwards compatible.

External code calling the API is backwards compatible (render just 
invokes iter_render); implementations that override the behaviour 
of a Node derivative, however, have to override *both*, as you pointed 
out, if the code is trying to target .96/svn- which kind of sucks.

It's addressable via them including their own

def render(self, context):
    return ''.join(self.iter_render(context))

For folks who need that compatibility, I'd prefer that a metaclass be 
available they can include which handles it automatically- basically 
avoiding the potential of a screwup.

Might be overkill however; thoughts definitely welcome on that one.
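
Something like (sketch):

class CompatNodeMeta(type):
    # auto-generate render() for nodes that only define iter_render(),
    # so one class works on both .96 and svn
    def __new__(cls, name, bases, attrs):
        if 'iter_render' in attrs and 'render' not in attrs:
            def render(self, context):
                return ''.join(self.iter_render(context))
            attrs['render'] = render
        return type.__new__(cls, name, bases, attrs)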


> > , I'd be +.5 on punting it out of 
> > mainline and into its own branch- already have had enough bugs pop 
> > out that were pretty unexpected that I'm a fair bit worried about what
> > bugs are still uncovered.
> 
> Since this really isn't the problem I want to work on right now and
> since it is causing some destabilisation and changes block a bit based
> on somebody with commit access being ready to react, I'll back it out so
> that you can work on it in peace a bit more.

Works for me.  Ironing out the known issues shouldn't be too tricky, 
but I'm thinking the test coverage needs a bit of an increase before 
iter_render appears in trunk again (to try to smoke out any remaining 
explosions prior to folks hitting them).

~harring




Re: template rendering as iteration, instead of layered concatenation

2007-06-21 Thread Brian Harring
On Fri, Jun 22, 2007 at 09:47:57AM +1000, Malcolm Tredinnick wrote:
> 
> On Thu, 2007-06-21 at 10:16 -0700, Brian Harring wrote:
> > On Thu, Jun 21, 2007 at 08:33:05PM +1000, Malcolm Tredinnick wrote:
> > > 
> > > Brian,
> > > 
> > > On Thu, 2007-06-14 at 12:23 -0700, Brian Harring wrote:
> > > > Just filed ticket 4565, which basically converts template rendering 
> > > > away from "build a string within this node of subnode results, return 
> > > > it, wash rinse repeat", and into "yield each subnode chunk, and my 
> > > > data as it's available".
> > > [...]
> > > > * Far less abusive on memory; usual 'spanish inquisition' heavy 
> > > > test (term coined by mtredinnick, but it works), reduction from 
> > > > 84m to 71m for usage at the end of rendering. What I find rather 
> > > > interesting about that reduction is that the resultant page is 6.5 
> > > > mb; the extra 7mb I'm actually not sure where the reduction comes 
> > > > from (suspect there is some copy idiocy somewhere forcing a new 
> > > > string)- alternatively, may just be intermediate data hanging around, 
> > > > since I've been working with this code in one form or another for 3 
> > > > months now, and still haven't figured out the 'why' for that diff.
> > > 
> > > When you were doing this memory checking, what server configuration were
> > > you using?
> > 
> > lighttpd/fcgi; the gain there is from what directly comes out of the 
> > templating subsystem (more specifically, HttpResponse gets an iterable 
> > instead of a string).  Gain there *should* be constant regardless of 
> > setup- the exception to this would be if your configuration touches 
> > .content, or prefers to collect the full response and send that 
> > instead of buffering- if your setup kills the iteration further up 
> > (and it's not middleware related), would be interested.
> > 
> > Nudge on ticket #4625 btw ;)
> 
> You can safely assume I'm aware of it. #4625 is way down the list,
> though; there are bigger problems. I'm currently trying to work through
> all the unexpected side-effects of the first change (in between being
> really busy with Real Life). If we don't come up with a solution to the
> hanging database connections today -- see relevant thread on the
> django-users list -- I'm going to back out the whole iterator change for
> a while until we can fix things properly.

Up to you regarding punting it; at the very least, I don't much like 
breaking folks' setups, so a temp punt won't get complaints out of me.

Re: the new thread, I'm not on that ml, although by the looks of it I 
ought to be.  Also, in the future if it *looks* like I had a hand in 
whatever borkage is afoot, don't hesitate to cc me- as you said, real 
life intervenes, so I'm not watching everything.  I'd rather have a 
few false positives email wise than be missing something I may have 
broken.


> The reason for the original question was part of that. We might have to
> give back all the memory savings, since we need to know when use of the
> database is definitely finished. One solution that remains
> WSGI-compliant is to have the __call__ method return a single item
> iterable of all the content  which is basically a return to the old
> behaviour (slightly fewer string copies, but not many). That is how the
> mod_python handler works already ,in effect, so that's easier to fix.
> Alternatively, we have to introduce much tighter coupling between
> templates, backend and HTTP layer to manage the database connections
> correctly, which would be a nasty thing to have to do.
>
> 
> Making the change at the WSGI handler level is more efficient in terms
> of network traffic, since WSGI servers are not permitted to buffer
> iterated writes. Lots of short writes can be occurring with a compliant
> server at the moment. But if we do that, it's not clear it's enough of a
> win over the original code to warrant it.

Keep in mind I'm having a helluva time finding *exactly* what's 
required of a handler beyond __iter__/close, but a potential approach-

from itertools import chain

class IterConsumedSignal(object):

  def __iter__(self):
return self

  def next(self):
dispatcher.send(signal=signals.request_finished)
raise StopIteration()

class ResponseWrapper(object):

  def __init__(self, response):
self._response = response
if hasattr(self._response, 'close'):
  self.close = self._response.close

  def __iter__(self):
assert self._response is not None
try:
  if isinstance(self._response, basestring):
return chain([self._response], IterConsumedSignal())
  return chain(self._respons
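
A hypothetical usage on the WSGI side, just to show where the wrapper 
would sit- get_response and the status/header handling here are 
stand-ins for however the real handler builds the HttpResponse, not 
the actual django.core.handlers code:

def application(environ, start_response):
  response = get_response(environ)  # stand-in for the handler's work
  start_response('200 OK', [('Content-Type', 'text/html')])
  # request_finished now fires only after the server has pulled the
  # last chunk, so the db connection stays alive until the response
  # is fully written.
  return ResponseWrapper(response.content)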

Re: Bug in template system?

2007-06-21 Thread Brian Harring
On Thu, Jun 21, 2007 at 04:19:07PM -0400, Mario Gonzalez wrote:
> 
>   Hello, today after an svn update I've got a
> 
> Traceback (most recent call last):
>   File "/usr/lib/python2.5/site-packages/django/core/servers/basehttp.py",
> line 273, in run
> self.finish_response()
>   File "/usr/lib/python2.5/site-packages/django/core/servers/basehttp.py",
> line 312, in finish_response
> self.write(data, False)
>   File "/usr/lib/python2.5/site-packages/django/core/servers/basehttp.py",
> line 396, in write
> self._write(data)
>   File "socket.py", line 262, in write
> self.flush()
>   File "socket.py", line 249, in flush
> self._sock.sendall(buffer)
> error: (32, 'Broken pipe')
> Exception django.template.TemplateSyntaxError:
> TemplateSyntaxError('Caught an exception while rendering: ',) in
> <generator object at 0x...> ignored
> 
>   I'm trying to find what the problem is, but maybe you can help me a bit.
> 
>Ideas?

File a bug (cc me) including the template(s) involved please.

Getting the distinct vibe the unittests need a fair bit of expansion 
also :/

~harring


pgpRt5yFHZe0z.pgp
Description: PGP signature


Re: template rendering as iteration, instead of layered concatenation

2007-06-21 Thread Brian Harring
On Thu, Jun 21, 2007 at 08:33:05PM +1000, Malcolm Tredinnick wrote:
> 
> Brian,
> 
> On Thu, 2007-06-14 at 12:23 -0700, Brian Harring wrote:
> > Just filed ticket 4565, which basically converts template rendering 
> > away from "build a string within this node of subnode results, return 
> > it, wash rinse repeat", and into "yield each subnode chunk, and my 
> > data as it's available".
> [...]
> > * Far less abusive on memory; the usual 'spanish inquisition' heavy 
> > test (term coined by mtredinnick, but it works) gives a reduction from 
> > 84m to 71m for usage at the end of rendering. What I find rather 
> > interesting about that reduction is that the resultant page is 6.5 
> > mb; the extra 7mb I'm actually not sure where the reduction comes 
> > from (suspect there is some copy idiocy somewhere forcing a new 
> > string)- alternatively, may just be intermediate data hanging around, 
> > since I've been working with this code in one form or another for 3 
> > months now, and still haven't figured out the 'why' for that diff.
> 
> When you were doing this memory checking, what server configuration were
> you using?

lighttpd/fcgi; the gain there is from what directly comes out of the 
templating subsystem (more specifically, HttpResponse gets an iterable 
instead of a string).  Gain there *should* be constant regardless of 
setup- the exception to this would be if your configuration touches 
.content, or prefers to collect the full response and send that 
instead of buffering- if your setup kills the iteration further up 
(and it's not middleware related), I'd be interested.
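
To make the distinction concrete- a sketch of the two shapes, with the 
names simplified (this isn't the actual HttpResponse internals):

# Old style: collapse the node tree to one big string up front.
response = HttpResponse(template.render(context))

# New style: hand HttpResponse the generator; the server pulls chunks
# as the template produces them, so peak memory stays near chunk size
# and the first bytes hit the wire immediately.
response = HttpResponse(template.iter_render(context))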

Nudge on ticket #4625 btw ;)

~harring


pgpYKSeWOUzx9.pgp
Description: PGP signature


Re: Fix for #4565 caused this code to break, not sure how to fix my code.

2007-06-19 Thread Brian Harring
On Tue, Jun 19, 2007 at 01:12:40PM -0600, Norman Harman wrote:
> 
> This change
> 
> r5482 | mtredinnick | 2007-06-17 00:11:37 -0700 (Sun, 17 Jun 2007) | 4 lines
> 
> Fixed #4565 -- Changed template rendering to use iterators, rather than
> creating large strings, as much as possible. This is all backwards 
> compatible.
> Thanks, Brian Harring.
> 
> Specifically the isinstance checks in/django/django/template/__init__.py
> 
> class NodeBase(type):
>     def __new__(cls, name, bases, attrs):
>         """
>         Ensures that either a 'render' or 'iter_render' method is
>         defined on any Node sub-class. This avoids potential infinite
>         loops at runtime.
>         """
>         if not (isinstance(attrs.get('render'), types.FunctionType) or
>                 isinstance(attrs.get('iter_render'), types.FunctionType)):
>             raise TypeError('Unable to create Node subclass without '
>                             'either "render" or "iter_render" method.')
>         return type.__new__(cls, name, bases, attrs)
> 
> 
> Breaks this code, from 
> http://code.google.com/p/django-voting/wiki/RedditStyleVoting
> 
> Seems like the above check doesn't "see" the inherited method.  Based on
> the fact that the above test passes if I put a render method in the two
> subclasses.  But I don't really grok __new__/metaclass stuff all that
> supremely.  Any help/insight on how to fix this is appreciated.
> 
> from django import template
> 
> class BaseForObjectNode(template.Node):
>     def __init__(self, object, context_var):
>         self.object = object
>         self.context_var = context_var
> 
>     def render(self, context):
>         try:
>             object = template.resolve_variable(self.object, context)
>         except template.VariableDoesNotExist:
>             return ''
>         context[self.context_var] = self._func(object)
>         return ''
> 
> class ScoreForObjectNode(BaseForObjectNode):
>     _func = Vote.objects.get_score
> 
> class AverageForObjectNode(BaseForObjectNode):
>     _func = Vote.objects.get_average

See ticket 4625; ought to fix it.  Adding tests would be welcome also, 
since none seem to have been added :)
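
For the curious, the fix needs the metaclass to look past the class's 
own attrs dict; a sketch of the idea (not the literal 4625 patch- the 
name-based check for the Node root is a simplification):

class NodeBase(type):
  def __new__(cls, name, bases, attrs):
    new_cls = type.__new__(cls, name, bases, attrs)
    if name == 'Node':
      # The root class supplies the mutually recursive defaults;
      # don't subject it to the check.
      return new_cls
    for klass in new_cls.__mro__:
      if klass is object or klass.__name__ == 'Node':
        continue
      # Any class in the chain (e.g. BaseForObjectNode) defining its
      # own render/iter_render satisfies the requirement.
      if 'render' in klass.__dict__ or 'iter_render' in klass.__dict__:
        return new_cls
    raise TypeError('Unable to create Node subclass without either '
                    '"render" or "iter_render" method.')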

Thanks,
~harring


pgpNOFRT1sLgx.pgp
Description: PGP signature


Re: template rendering as iteration, instead of layered concatenation

2007-06-15 Thread Brian Harring
On Thu, Jun 14, 2007 at 06:16:22PM -0500, Jacob Kaplan-Moss wrote:
> 
> On 6/14/07, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
> > Whoops. :-(
> 
> Yeah, I got stuck on that, too :)
> 
> > To completely bulletproof this, you could add a metaclass to Node to
> > check that the Node subclass that is being created has a method called
> > either "render" or "iter_render" (or both). That's only run once each
> > import, so won't give back any of the savings in real terms.
> 
> Oh, that would work well.
> 
> Any reason not to spell ``iter_render`` as ``__iter__``?

__iter__ is used by iter() and takes no args; iter_render (and 
render), however, take a single arg- the context.

Plus __iter__ is already used for visiting each node of the tree ;)

~harring


pgpVsvLlFjtpL.pgp
Description: PGP signature


template rendering as iteration, instead of layered concatenation

2007-06-14 Thread Brian Harring
Just filed ticket 4565, which basically converts template rendering 
away from "build a string within this node of subnode results, return 
it, wash rinse repeat", and into "yield each subnode chunk, and my 
data as it's available".

The pros of it are following (copy/pasting from the ticket):

* instant results; as long as you don't have any tags/nodes that 
require collapsing their subtree down into a single string, data is 
sent out immediately. Rather nice if you have a slow page- you start 
getting chunks of the page immediately, rather than having to wait for 
the full 10s/whatever time for the page to render.

* Far less abusive on memory; the usual 'spanish inquisition' heavy 
test (term coined by mtredinnick, but it works) gives a reduction from 
84m to 71m for usage at the end of rendering. What I find rather 
interesting about that reduction is that the resultant page is 6.5 
mb; the extra 7mb I'm actually not sure where the reduction comes 
from (suspect there is some copy idiocy somewhere forcing a new 
string)- alternatively, may just be intermediate data hanging around, 
since I've been working with this code in one form or another for 3 
months now, and still haven't figured out the 'why' for that diff.

* Surprisingly, at least for the test cases I have locally, I'm not 
seeing a noticeable hit on small pages, nothing above background noise 
at the very least- larger pages see a mild speed up (which was 
expected- 4-5%). 

* conversion over is simple- only potential API breakage is for Nodes 
that don't actually do anything, i.e. no render method.

The con of it is that the Node base class grows a potential gotcha:

class Node(object):
  def render(self, context):
    # Default: collapse the iterated chunks into a single string.
    return ''.join(self.iter_render(context))

  def iter_render(self, context):
    # Default: expose the rendered string as a one-chunk iterable.
    return (self.render(context),)

As hinted at above, if a derivative (for whatever reason) doesn't 
override iter_render or render, it goes recursive and eventually 
results in a RuntimeError (exceeds max stack depth).
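
For example, a node that renders to nothing must still override one of 
the pair- a tiny sketch against the Node base above:

class CommentNode(Node):
  def iter_render(self, context):
    # Overriding one of the two breaks the mutual recursion; an empty
    # tuple keeps render() returning ''.
    return ()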

Have been playing with this patch in one form or another for last few 
months, and have been pretty happy with the results- what I'm 
specifically wondering is if folks are willing to accept the potential 
Node gotcha for the gains listed above.

Alternatively, if folks have an alternate solution, would love to hear 
it.

General thoughts, comments, etc, desired- thanks.
~harring


pgpgu8FwPBdVn.pgp
Description: PGP signature


Re: signals

2007-06-13 Thread Brian Harring
On Wed, Jun 13, 2007 at 09:37:35AM +1000, Malcolm Tredinnick wrote:
> On Tue, 2007-06-12 at 06:16 -0700, Brian Harring wrote:
> > Not yet advocating it (mainly since digging it out would be ugly), but 
> > if you take a look at the bits above, having the option to disable 
> > verification on read *would* have a nice kick in the pants for ORM 
> > object instantiation when the admin has decided the data is guranteed 
> > to be the correct types.
> 
> I have a blog post I'm still collecting the data for (making pretty
> graphs, mostly), but it seems that the extreme cases you can actually
> make instantiation much faster by just writing an __init__ method on the
> Model sub-class that does just the right thing -- customised for the
> model fields -- since it's only an attribute populator when you get
> right down to it. One can even generate this code automatically. Again,
> not something to worry about for the 90% case, but for people wanting to
> create 10^6 objects, it's worth the ten minutes to code it up by hand.
> That'll appear on the community aggregator when I post it.

Sounds semi-fragile, since presumably it totally bypasses the 
default Model.__init__ (which later on may grow important instance 
initialization bits).
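
For reference, a sketch of the hand-rolled populator being described- 
the model and fields are made up, and this deliberately skips the 
kwargs handling and signal sends that Model.__init__ performs:

from django.db import models

class Article(models.Model):
  title = models.CharField(max_length=100)
  pub_date = models.DateTimeField()

  def __init__(self, id=None, title=None, pub_date=None):
    # Pure attribute population, in field order; fragile if the
    # fields change, and no pre_init/post_init signals fire.
    self.id = id
    self.title = title
    self.pub_date = pub_date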

Personally, I'm more interested in speeding up django than bypassing 
chunks of it for speed- realize it's not always possible, but that's 
where my interests lie (upshot of it is that the common cases get a 
bit faster while the extremes become usable).  That said, others may 
be less stubborn and it may be useful to them :)


> > > Given that upstream pydispatcher isn't really being maintained, I don't
> > > think we should be too hesitant to tweak it for our needs.
> > 
> > Don't spose it could just be thrown out?  The code really *is* ugly :)
> 
> Agree with the second part (although I'd be less hyperbolic, if I saw it
> from a colleague, we'd be having discussions).

Humour, man, humour ;)

'Ugly' isn't hyperbole, I'm afraid.  The term's usage isn't meant 
to say anything about the author, merely that this code, while 
working, needs some heavy cleanup- something a bit more 
elegant/maintainable.

Internals are pretty obscure, and no one seems to have firm 
expectations out of the code- for example, should the order of 
connect calls reflect the order of dispatches when a send comes in?  
Yet to find an answer to that one, both from django devs/users 
expectations, and searching the remains of upstream.

Part of the confusion about the code comes down to the age of the 
code- it originally aimed for py2.2 (since then requiring 2.3.3 due to 
python bugs).  A few examples: sets came around in 2.3, so the constant 
order of dispatch may not be by design- it may just be an 
implementation quirk.

Same goes for the misc dicts floating in dispatcher; are they 
implemented that way for a reason, instead of using 
weakref.Weak*Dictionary?  The author of pydispatch as far as I know 
also implemented weakref.Weak*Dictionary (they've got it in their vcs 
at least)- assuming that's correct, either the author skipped updating 
pydispatch for Weak*Dictionary (either due to time/interest, or the 
updating being tricky), or there was an undocumented reason it wasn't 
used- basically have to assume the latter and try to root about for 
an undocumented reason, since weakref collections can be fairly 
tricky.

Either way, I'm still playing locally, but I strongly suspect I'll 
wind up chucking the main core of dispatch.dispatcher and folding it 
into per signal handling; may not pan out, but at this point it seems 
the logical next step for where I'm at patch wise.


> Keeping the API similar
> to what it is (or at least, "routine" to port -- possible to do with a
> reg-exp, say) would be worthwhile, since there is a lot of code in the
> wild using the signal infrastructure.

Don't have any intention of chucking the api at this point; will make 
more sense when the patch is posted, but basically I'm shifting the 
logic into individual signal instances- making them smarter.

So, for (unimplemented, but soon) simple signals, the following example- 
instead of

dispatcher.send(signal=signals.request_started)

would be

signals.request_started.send()

Reasoning is twofold- 1) it locks down the data that's passed to 
receivers, so you can figure out what's sent for that signal just via 
pydoc; 2) it shifts control of actually dispatching into the signal, 
which can be made aware at the time of registration of whether or not 
anything is actually listening.

An expected by-product of that shift is that the set of listeners for a 
signal gets calculated when a listener is added, instead of doing a lookup 
on each send invocation.

Nice aspect of this is that the dispatcher.send api can still exist- it 
just winds up delegating to the signal if it's new style; if not, it 
falls back to the existing connections machinery.
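
A rough sketch of what such a smarter signal could look like (my 
reading of the plan, not the actual patch- connect's signature in 
particular is a guess):

class Signal(object):

  def __init__(self):
    self.receivers = []

  def connect(self, receiver, sender=None):
    # Listener bookkeeping happens here, at registration time.
    self.receivers.append((receiver, sender))

  def send(self, sender=None, **kwargs):
    # Fast path: a disconnected signal costs one truth test.
    if not self.receivers:
      return []
    return [receiver(sender=sender, **kwargs)
            for receiver, wanted in self.receivers
            if wanted is None or wanted is sender]

request_started = Signal()
request_started.send()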

Re: make model.save() take kw params

2007-06-12 Thread Brian Harring
On Wed, Jun 13, 2007 at 12:11:29AM +0530, Amit Upadhyay wrote:
> Hi,
> Wouldn't it be cool if we can say
> user.save(email="[EMAIL PROTECTED]"), which will do the
> equivalent of user.email = "[EMAIL PROTECTED]"; user.save()?

Not really, no. :)

Save is simple; why make it more complex?  Could just as easily add a 
utility function that does this.  Yes, it would be two func calls- but 
personally, I don't see the gain in overloading save to do more than 
push the current (keyword there is 'current') state of the instance 
into the backend.
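
Something along these lines would cover the request without touching 
save (a hypothetical helper, not anything in django):

def set_and_save(instance, **kwargs):
  # Assign, then persist- two calls' worth of work, save stays simple.
  for attr, value in kwargs.items():
    setattr(instance, attr, value)
  instance.save()

set_and_save(user, email="[EMAIL PROTECTED]")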

Additional thought: it would make disabling/enabling the sending of 
signals from save impossible- the update would have to occur prior 
to the signal being sent, thus you couldn't just wrap the save method 
on the fly.

Either way, -1 personally since I just don't see the gain in making 
save more complex.

~harring


pgpeI7dFBROon.pgp
Description: PGP signature


Re: signals

2007-06-12 Thread Brian Harring
On Mon, Jun 11, 2007 at 07:39:08PM +1000, Malcolm Tredinnick wrote:
> 
> On Sun, 2007-06-10 at 09:07 -0700, Brian Harring wrote:
> > Curious, how many folks are actually using dispatch at all?
> > 
> > For my personal usage, I'm actually not using any of the hooks- I 
> > suspect most folks aren't either.  That said, I'm paying a fairly 
> > hefty price for them.
> > 
> > With Model.__init__'s send left in for 53.3k record instantiation 
> > (just a walk of the records), time required is 9.2s.  Without the 
> > send, takes 7.0s.  Personally, I'd like to get that quarter of the 
> > time slice back. :)
> 
> Since you already have your own version of the Spanish Inquisition set
> up for testing, what portion of this overhead is just the function call?
> If the dispatch function is replaced with just "return", do we save
> much?

Offhand, replacing the dispatch with just 'return' is actually 
semi tricky, since there are a few receivers required for the django 
internals (class preparation).  Basically requires delegating the send 
to the signal in select cases (for *_delete, and request_*, don't see 
much option unless they can be shifted around also).

For __init__ and save however, the wrap trick will fly- meaning don't 
even need the empty function call.

Either way, profile dump follows.

Top 30 via lsprof (cProfile for 2.5); with send left in Model.__init__

>>> ps.sort_stats("ti").print_stats(30)
Mon Jun 11 02:55:18 2007dump.stats

 1747388 function calls (1745991 primitive calls) in 18.627 CPU seconds

   Ordered by: internal time
   List reduced from 916 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    53332    3.314    0.000    9.046    0.000 base.py:97(__init__)
   106720    2.592    0.000    3.162    0.000 dispatcher.py:271(getAllReceivers)
   373318    1.995    0.000    3.056    0.000 base.py:38(utf8)
        2    1.723    0.861    1.723    0.861 base.py:99(execute)
    53332    1.631    0.000    4.687    0.000 base.py:37(utf8rowFactory)
      537    1.540    0.003    6.227    0.012 ~:0(<...>)
   380348    1.329    0.000    1.329    0.000 ~:0(<...>)
   196950    1.061    0.000    1.061    0.000 ~:0(<...>)
    53334    0.809    0.000   17.958    0.000 query.py:171(iterator)
   106678    0.808    0.000    3.980    0.000 dispatcher.py:317(send)
   213374    0.570    0.000    0.570    0.000 ~:0(<...>)
116228/115981    0.321    0.000    0.322    0.000 ~:0(<...>)
        3    0.168    0.056   18.126    6.042 query.py:468(_get_data)
    53334    0.162    0.000    0.162    0.000 ~:0(<...>)
       60    0.060    0.001    0.118    0.002 functional.py:26(__init__)
   245/60    0.042    0.000    0.158    0.003 sre_parse.py:374(_parse)
     6660    0.036    0.000    0.036    0.000 functional.py:36(__promise__)
     3118    0.035    0.000    0.051    0.000 sre_parse.py:182(__next)
   458/56    0.032    0.000    0.126    0.002 sre_compile.py:27(_compile)
        1    0.027    0.027    0.032    0.032 sre_compile.py:296(_optimize_unicode)
      206    0.022    0.000    0.067    0.000 sre_compile.py:202(_optimize_charset)
        7    0.021    0.003    0.455    0.065 __init__.py:1(?)
     6360    0.017    0.000    0.017    0.000 ~:0(<...>)
     2573    0.017    0.000    0.059    0.000 sre_parse.py:201(get)
  634/243    0.014    0.000    0.018    0.000 sre_parse.py:140(getwidth)
        1    0.014    0.014   18.627   18.627 full-run.py:2(?)
    19/12    0.012    0.001    0.187    0.016 ~:0(<__import__>)
        1    0.008    0.008    0.011    0.011 socket.py:43(?)
        1    0.007    0.007    0.109    0.109 urllib2.py:71(?)
   182/57    0.007    0.000    0.160    0.003 sre_parse.py:301(_parse_sub)




without

>>> ps.sort_stats("ti").print_stats(30)
Mon Jun 11 03:02:10 2007/home/bharring/dump2.stats

 1320732 function calls (1319335 primitive calls) in 13.970 CPU seconds

   Ordered by: internal time
   List reduced from 916 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    53332    2.428    0.000    4.475    0.000 base.py:97(__init__)
   373318    2.004    0.000    3.059    0.000 base.py:38(utf8)
        2    1.715    0.858    1.716    0.858 base.py:99(execute)
   380348    1.623    0.000    1.623    0.000 ~:0(<...>)
    53332    1.617    0.000    4.676    0.000 base.py:37(utf8rowFactory)
      537    1.538    0.003    6.215    0.012 ~:0(<...>)
   196950    1.055    0.000    1.056    0.000 ~:0(<...>)
    53334    0.744    0.000   13.305    0.000 query.py:171(iterator)
116228/115981    0.315    0.000    0.317    0.000 ~:0(<...>)
        3    0.166    0.055   13.470    4.490 query.py:468(_get_data)
    53334    0.158    0.000    0.158    0.000 ~:0(<...>)
       60    0.060    0.001    0.119    0.002 functional.py:26(__init__)
   245/60    0.042    0.000    0.158    0.003 sre_parse.py:374(_parse)

Re: Ping: Extend "for" tag to allow unpacking of lists

2007-06-10 Thread Brian Harring
On Fri, Jun 08, 2007 at 11:05:01PM -, SmileyChris wrote:
> 
> On Jun 9, 9:32 am, Brian Harring <[EMAIL PROTECTED]> wrote:
> > Realize it's hit trunk already,
> No worries, we can always open another ticket :)

Already watching enough tickets ;)


> > but noticed an annoying potential
> > gotcha- for single var in a for loop, any changes to the context stay
> > put- for multiple vars, the context gets wiped.
> Dang, I knew there must have been some logic in why I was handling the
> context originally - this was the case I had thought of then promptly
> forgotten about.
> 
> You're right. This is a change in behaviour between a single and
> multiple keywords and probably should be addressed.
> This would mean changing the context.update behaviour like mentioned
> in the other ticket in this thread and and making sure that any
> keywords not used this loop get popped out.

Related to that, context.(push|pop) really ought to return the newly 
(add|remov)ed scope- if you have direct access to the underlying scope 
object, you can safely bypass the context stack object and abuse dict 
methods directly (its update, clear, etc specifically).  'Safely' in 
this case means "without having to know the internals of Context".

Two upshots to it, you avoid involving Context.__.*item__ unless you 
specifically need it, more importantly it's a mild api tweak that 
is backwards compatible and removes the need to change the 
Context.update behaviour you're proposing.

For consistency, if these were changed, Context.update ought to return 
the newly added scope, but again, it's backwards compatible- 
if anyone was already relying on Context.(push|pop|update) returning 
None, well, they're special and need to be wedgied for relying on 
basically a void return :)
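
A sketch of the tweak, assuming the scopes are plain dicts stacked in 
a list with the newest at the front (which matches the popleft 
behaviour noted below, though not necessarily the exact internals):

class ContextPopException(Exception):
  pass

class Context(object):

  def __init__(self):
    self.dicts = [{}]

  def push(self):
    d = {}
    self.dicts.insert(0, d)
    return d  # the tweak: hand the new scope back to the caller

  def pop(self):
    if len(self.dicts) == 1:
      raise ContextPopException()
    return self.dicts.pop(0)  # likewise, return the removed scope

# A tag can then abuse dict methods on the owned scope directly:
context = Context()
scope = context.push()
scope.update({'x': 1})
scope.clear()  # per-iteration cleanup, no Context internals needed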


> > So... consistancy would be wise; which should it be?

Replying to myself since I've had some time to think it through, yet 
to see any code that relies on the parent preserving random mappings 
in the context- thinking wiping the context per iteration might be wise.  

Tags shouldn't really be leaving random cruft in the parent context, 
thus cleansing the owned scope per iteration avoids the potential 
for screwups.  Also has the benefit of simplifying ForNode.render 
while making it scale better (context.pop is a popleft on a list, so 
performance degrades linearly with the depth of the context stack).


> > Also, use izip instead of zip dang it :P
> Gosh, how many keyword arguments are you trying to use? ;)
> Does it really matter to use izip if it's going to usually be less
> than 3 iterations?

izip vs zip in this particular case is a rough 20% gain in izip's 
favor for < 10 args in raw izip/zip speed; honestly it's a background 
gain- it's a good habit to have for dict bits, however (plus it was a 
joke, the real improvement is suggested above).

Had a nice funny rant typed up related to the general lack of code 
efficiency in template code (ticket 4523 is an example), but instead 
just going to point out that adding new good habits to your 
repertoire isn't a bad thing.  Knowing stuff like the above is actually a 
*good* thing- no different than knowing that

if len(sequence):
  pass

is daft- the only time that's sane is if sequence.__nonzero__ isn't 
properly defined (which means it should be properly defined since it 
goes to the trouble of defining sequence.__len__ already- hacking 
around the lack of __nonzero__ via len is fugly).  The better habit, 
avoiding the extra func call and extra object, is-

if sequence:
  pass

As said, good habits.  Simpler code is, more often than not, faster.  
Using izip instead of zip for dict updates/creation is pretty much 
always a win (even on a single item), thus a similar 'good habit'.
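
A concrete instance of the habit- zip builds a throwaway list of 
tuples, while izip feeds dict() the same pairs lazily:

from itertools import izip

keys = ('x', 'y', 'z')
values = (1, 2, 3)
mapping = dict(izip(keys, values))  # no intermediate list allocated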

So as I said, "use izip instead of zip dang it :P".  Good habits such 
as that help to avoid "death by a thousand cuts" syndrome in your 
code.

Lecture aside, opinions on the proposed scope change for 
ForNode.render from above are desired.

~harring


pgpPUctxrZRNo.pgp
Description: PGP signature


signals

2007-06-10 Thread Brian Harring
Curious, how many folks are actually using dispatch at all?

For my personal usage, I'm actually not using any of the hooks- I 
suspect most folks aren't either.  That said, I'm paying a fairly 
hefty price for them.

With Model.__init__'s send left in for 53.3k record instantiation 
(just a walk of the records), time required is 9.2s.  Without the 
send, takes 7.0s.  Personally, I'd like to get that quarter of the 
time slice back. :)

Via ticket 3439, I've already gone after dispatch to try and speed it 
up- I probably can wring a bit more speed out of it, but the 
improvements won't come anywhere near reclaiming the 2.2s from above.

What is left, is flat out removing the send invocations if they're 
not needed- specifically shifting the send calls out of __init__, and 
wrapping __init__ on the fly *when* something tries to connect to it.

Effectively,

from django.dispatch.dispatcher import connect, disconnect
from django.db.models.signals import pre_init
from django.db.models import Model

class m(Model): pass

callback = lambda *a, **kw:None

# compare im_func: unbound method wrappers are recreated on each access
assert m.__init__.im_func is Model.__init__.im_func
connect(callback, sender=m, signal=pre_init)
assert m.__init__.im_func is not Model.__init__.im_func

disconnect(callback, sender=m, signal=pre_init)
assert m.__init__.im_func is Model.__init__.im_func

The pro of this is that the slowdown is limited to *only* the 
instances where something is known to be listening- listening to model 
class foo doesn't slow down class bar basically; listening to save 
doesn't slow down __init__, etc.
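
Roughly, the wrapping trick looks like this (a sketch of the 
mechanics, not the actual patch- the real version would be driven from 
connect/disconnect):

from django.dispatch import dispatcher
from django.db.models.signals import pre_init

def enable_pre_init(model_cls):
  original = model_cls.__init__

  def wrapped(self, *args, **kwargs):
    # Only pay the send cost on classes somebody is listening to.
    dispatcher.send(signal=pre_init, sender=model_cls,
                    args=args, kwargs=kwargs)
    original(self, *args, **kwargs)

  model_cls.__init__ = wrapped

def disable_pre_init(model_cls):
  # Dropping the wrapper restores the inherited Model.__init__.
  del model_cls.__init__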

The cons I'll enumerate:

1) to do this requires a few tricks- specifically, wrapping methods on 
a class on the fly when something starts listening, and reversing the 
wrapping when nobody is listening anymore.  Personally, I'm 
comfortable with this (the misc contribute_to_class crap going on in 
_meta already isn't too far off).  Realize however others may not be 
comfortable with it- thus speak up please.

2) usage of Any for a sender means we have to track the potential 
senders.

3) usage of Any for a signal means we have to track the signals 
involved in this trick (registration of the signal instance), and #2.

4) Not strictly required, but if sender is a class (and the only 
listeners are listening to *that* class, not Any), anything deriving 
from that class will still fire the signal- meaning the performance 
gain is lost for the derivative; sends are occurring that don't have 
any listeners.  This can be reclaimed via some tricks in 
ModelBase.__new__ offhand if desired and the use scenario is at least 
semi-common (how many people derive from a defined Model?).

5) the wrapping trick introduces an extra func into the callpath when 
something is listening.  That's basically a semi-elusive way of 
saying "it'll be slightly slower when there is a listener than what's 
in place now"; haven't finished the implementation, thus I don't have 
specifics, but figure a few usecs hit from the wrapper itself (since 
the codepaths are executed often enough, it's worth noting for cases 
where listeners are expected).

Would appreciate any thoughts on above; the cons are basically 
implementation specific, that said I can work through them (I want 
that 25% back, damn it ;)- question is if folks are game for it or 
not, if the idea is palatable or not.


Aside from that, would really help if I had a clue what folks are 
actually using dispatch for with django- which signals, common 
patterns for implementing their own signals (pre/post I'd assume?), 
common signals they're listening to, etc.

Knowing it would help with optimizing dispatch further, and would be 
useful if someone ever decides to gut dispatch and refactor the code 
into something less fugly.

~harring


pgpfeiKPWGHvH.pgp
Description: PGP signature


Re: Ping: Extend "for" tag to allow unpacking of lists

2007-06-08 Thread Brian Harring
On Thu, Jun 07, 2007 at 09:53:07PM +0800, Russell Keith-Magee wrote:
> 
> On 6/6/07, SmileyChris <[EMAIL PROTECTED]> wrote:
> >
> > New patch uploaded, taking into account Brian's thoughts (and fixing a
> > small bug):
> >
> > http://code.djangoproject.com/ticket/3523#comment:11
> 
> Ok; I've had a chance to look at this now. I've made a couple of minor
> modifications, mostly in docs, and to make the parser more forgiving
> on whitespace around the separating commas.
> 
> Unless there are any objections, I'm happy to check this into trunk.

Realize it's hit trunk already, but noticed an annoying potential 
gotcha- for single var in a for loop, any changes to the context stay 
put- for multiple vars, the context gets wiped.

Basically, template equiv of
for x in xrange(10):
  if x:
    print y
  y = 'foon'

behaves, while

for x,z in enumerate(xrange(10)):
  if x:
    print y
  y = 'foon'
  
would cleanse the local scope each iteration, leading to (at best) 
the 'string if invalid' output, at worst, pukage.

So... consistency would be wise; which should it be?  Also, use izip 
instead of zip dang it :P
~harring


pgp6R4h58md18.pgp
Description: PGP signature


Re: Ping: Extend "for" tag to allow unpacking of lists

2007-06-05 Thread Brian Harring
Personally, I'd like the patch to go in; the support is far easier on 
the eyes reading-wise, and it has the nice side effect of speeding up 
the access- index access is last in the chain for var resolution, 
meaning several pricey exceptions/tests occur on every access.

Faster resolution, and cleaner to read.

On Tue, Jun 05, 2007 at 02:34:42AM -, SmileyChris wrote:
> 
> Ticket http://code.djangoproject.com/ticket/3523 has been sitting

Comments on the patch/syntax at least; Offhand, test 
'for-tag-unpack03' points out some uglyness in the api-

{% for key value in items %}
{{ key }}:{{ value }}/
{% endfor %}

given ((1,2), (3,4)) yields

'1:2/3:4/'

So which is it?  Comma delimited, or space?  Realize the parser 
forces args coming in split on whitespace, but I don't see the point 
in supporting two different delimiters for the 'var names' chunk of 
the syntax.

Prefer ',' personally (easier to pick out, among other things), but 
one or the other; KISS mainly.  Aside from that,

{% for pythons,allowed,trailing,comma,is,not,worth,preserving, in items %}
..super happy template crap...
{% endfor %}

Again, imo, isn't wise.  Technically, you can shove in

{% for x,,y in foon %}{% endfor %}

So the parsing there (the 'filter(None, ...)' specifically) is too 
loose, imo.

Either way, semantics there differ from python; ','  serves as a 
marker (basically) to tuple'ize the surrounding bits-

>>> for x in [(1,),(2,)]:print x
(1,)
(2,)

is different from the rather nifty deref abuse of

>>> for x, in [(1,), (2,)]:print x
1
2

Not sure if supporting the trailing comma deref trick is 
worthwhile; either way patch up the loose commas handling and it's no 
longer valid syntax, so the issue can be ignored till someone 
actually pushes for adding the equiv trick into the template syntax.

Use itertools.izip also (you don't need the intermediate list, thus 
don't allocate one).

> The ticket itself relies on a change to context.update, which is
> probably why it has been delayed but my latest crack at that makes
> more sense (http://code.djangoproject.com/ticket/3529).

Technically the change in 3529 should support dict.update's "I take 
either a sequence of len 2 tuples, or a dict" also- meaning if update 
is going to be made closer to dict behavior, the intermediate dict 
isn't needed (straight izip in that case).

~harring


pgp8I5QpnzBTx.pgp
Description: PGP signature


Re: Buildbot?

2007-04-18 Thread Brian Harring
On Wed, Apr 18, 2007 at 11:16:56AM -0500, Jacob Kaplan-Moss wrote:
> 
> On 4/18/07, Jonathan Daugherty <[EMAIL PROTECTED]> wrote:
> > Do you have a resident buildbot?  That could be used to run the
> > regression tests on (all pythons) x (all databases).  It would still
> > take time, of course, but it could at least be automated[1][2].
> 
> I've spent some time in the past trying to get one set up (I've used
> both the buildbot and Bitten, a Trac plugin), but I've not been able
> to figure out how to get it working with Django's custom test harness.

What exactly was the issue you were having?  Offhand, a ShellCommand 
with haltOnFailure=True ought to suffice- last I looked, y'all's test 
runner properly set the exit code if a failure was detected, thus you 
should be able to rely on that.
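
For illustration, a master.cfg fragment along those lines (the 
buildbot API spelling here is the modern one, and the checkout/test 
commands are assumptions about the setup):

from buildbot.process.factory import BuildFactory
from buildbot.steps.shell import ShellCommand
from buildbot.steps.source.svn import SVN

f = BuildFactory()
f.addStep(SVN(repourl="http://code.djangoproject.com/svn/django/trunk"))
# runtests.py exits non-zero on a test failure, so haltOnFailure stops
# the build right there rather than burying the result.
f.addStep(ShellCommand(command=["python", "tests/runtests.py"],
                       haltOnFailure=True))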

~harring


pgpF58MrtEh4g.pgp
Description: PGP signature


Re: Upcoming Django release, and the future

2007-02-26 Thread Brian Harring
On Mon, Feb 26, 2007 at 12:04:29AM -0600, Jacob Kaplan-Moss wrote:
> 
> On 2/25/07, Brian Harring <[EMAIL PROTECTED]> wrote:
> > http://code.djangoproject.com/ticket/3440 , Models.__init__
> > refactoring.
> 
> #3440 is fixed and has to do with DateQuerySet; did you mean #3438?

Pardon, 3438 is the correct one.

~harring


pgpFH8rs51kDM.pgp
Description: PGP signature


Re: Upcoming Django release, and the future

2007-02-25 Thread Brian Harring
On Sun, Feb 25, 2007 at 10:56:36PM -0600, James Bennett wrote:
> If there's a bug that's been annoying the heck out of you and you want
> it fixed before the release, this would be the time to speak up about
> it.

http://code.djangoproject.com/ticket/3440 , Models.__init__ 
refactoring.

Fair amount of models instantiation is from data pulled from the 
backend, meaning it's passed in via positional args- the current 
__init__ however is written more for instantiation via passing args in 
through kwargs.

Due to this, instantiating a model derivative for a record (a fairly 
common case) does a lot of extra work that isn't needed- at the very 
least, a fair slowdown due to exception throwing/catching.

Patch in that ticket reorganizes the processing so that the same 
behaviour is preserved, but processing starts with positional args 
first- 33% reduction in runtime via the reordering.
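
The shape of the reordering, sketched (a simplification of the #3438 
patch- defaults and m2m handling are omitted):

def __init__(self, *args, **kwargs):
  # Fast path first: rows from the backend arrive positionally, in
  # field order, so consume them without any kwargs machinery.
  fields_iter = iter(self._meta.fields)
  for value, field in zip(args, fields_iter):
    setattr(self, field.attname, value)
  # Only the leftovers (typically user code) pay the kwargs cost.
  for field in fields_iter:
    setattr(self, field.attname, kwargs.pop(field.attname, None))
  if kwargs:
    raise TypeError("'%s' is an invalid keyword argument for this "
                    "function" % kwargs.keys()[0])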

If at all possible, I'd like to see that patch integrated for 0.96, 
since behaviour is the same but it provides a rather nice boost for 
decent sized queryset operations.

Thanks,
~harring


pgp0b8SroctGg.pgp
Description: PGP signature


Re: A couple tickets that should get some discussion

2007-02-06 Thread Brian Harring
On Tue, Feb 06, 2007 at 08:05:16AM -0600, Jacob Kaplan-Moss wrote:
> 
> On 2/5/07 10:39 PM, James Bennett wrote:
> > http://code.djangoproject.com/ticket/3439 -- Improving the
> > dispatcher's performance (why *are* we still using such an old version
> > of PyDispatcher? Did it just get forgotten deep down in the code?)
> 
> I'm not entirely sure, actually...
> 
> Can someone investigate swapping in the 
> latest pydispatcher and compare it against Brian's patch?

Poked at pydispatcher 2.0; upshot, they grew some tests since 1.0.  

Downside, the 2.0 release doesn't even build due to tests being 
left out of the release- said issue has been in v2 since its release 
(july), a bug was filed in august about it, and still no resolution.  

Not too confident about upstream being responsive/handling issues, in 
other words.  Test coverage is pretty basic also- no assertion in terms of 
callback ordering for example (so the f1/f2 question from the ticket 
remains), no send coverage, etc.

The actual v1 to v2 changes, ignoring the broken release, are 
inclusion of a basic test suite and whitespace changes- no real code 
changes, thus the performance issues are still there.


> I'd prefer to track pydispatcher instead of going out on our own, but I'm not 
> sure about the ramifications of that.

Personally, suspect y'all are already on your own from a maintenance 
standpoint- as said, upstream looks to have shut down.  Either way, 
updated the patch to correct a bug upstreams test suite spotted, and 
folded upstreams limited test suite in.

Regardless of if the optimizations go in you probably want to lift the 
test suite integration.

~harring


pgp46QlCq00GT.pgp
Description: PGP signature