Re: Django NoSQL support status?

2011-08-12 Thread Mike Malone
I've been away from these efforts for a while, but someone interested in
bridging SQL/NoSQL in Django might find these links interesting:

http://msdn.microsoft.com/library/bb308959.aspx
http://cacm.acm.org/magazines/2011/4/106584-a-co-relational-model-of-data-for-large-shared-data-banks/fulltext
http://unqlspec.org/display/UnQL/Syntax+Summary


On Thu, Aug 11, 2011 at 8:51 PM, Russell Keith-Magee <
russ...@keith-magee.com> wrote:

> On Fri, Aug 12, 2011 at 11:25 AM, Nathan Hoad wrote:
> > Hi guys, just wondering what the status is on the official NoSQL
> > support? I've done a lot of reading and the last official post I can
> > see is Alex Gaynor's GSoC post, from August last year mentioning the
> > Query refactor and how it should help down the road for NoSQL, but
> > nothing past that.
>
> There are two efforts in play here:
>
> 1) The "official" effort, that was largely worked on during the GSoC last
> year
>
> 2) The "unofficial" effort, largely the work of the guys at All
> Buttons Pressed. See
> http://www.allbuttonspressed.com/projects/django-nonrel for details.
>
> (I use "official" in quotes because I don't want to imply that the
> "unofficial" port is broken in some way or that the "official" port is
> better -- just that the "unofficial" version has been developed
> without any particular core team involvement. The reported feature set
> of the "unofficial" branch is certainly more advanced than the
> "official" branch)
>
> The progress on these two efforts (from the perspective of the Django
> core) has been stalled, for different reasons.
>
> The "official" effort has stalled because Alex hasn't been actively
> pursuing the branch; there are some good ideas in the branch, but
> there are also a couple of big design decisions that need to be made
> before the branch would be a candidate for merging (such as how to
> deal with non-integer autofields). See the mailing list discussions
> for more details on these problems.
>
> The "unofficial" effort is being actively worked on (AFAIK), but the
> impediment on getting it near core is a lack of external review.
> Despite repeated calls for independent review of the internals, none
> has been forthcoming.
>
> Speaking personally: I've only had one report from someone I trust
> that has used django-nonrel; and I haven't seen a good analysis of
> "how it does what it does". NoSQL isn't a huge priority for me
> personally, and so addressing other problems has taken a priority. I
> think it's safe to assume the same is true of the other core
> developers.
>
> > Basically, I'm just wondering what, if anything, is being done about
> > it?
>
> Is there a core developer actively working on the topic right now? Not
> to my knowledge.
>
> Is the core team interested in getting NoSQL support into trunk?
> Absolutely -- if it can be done without compromising the rest of
> Django's ORM, and if members of the community can work together to
> either:
>  1) resolve the issues in the "official" branch, or
>  2) build the social trust around the "unofficial" branch to show that
> it warrants inclusion in trunk.
>
> Yours
> Russ Magee %-)
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To post to this group, send email to django-developers@googlegroups.com.
> To unsubscribe from this group, send email to
> django-developers+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-developers?hl=en.
>
>




Re: Purpose of constant_time_compare?

2010-12-09 Thread Mike Malone
This is quickly becoming off topic, but I'll bite ;D.

On Wed, Dec 8, 2010 at 10:52 PM, Gabriel Hurley  wrote:

> You wanna hand over your paycheck now, or later? :-)
>
> I know someone with a functional white-hat timing attack script sitting on
> their laptop. They've been honing the statistical analysis to get the number
> of data points needed down to a less noticeable size, but the technique can
> already be successfully applied.
>

Pics or it didn't happen.

If you can show me a viable timing attack, over the Internet, under
reasonable real-world circumstances, and caused by something as negligible
as a single string comparison I will give you my paycheck. And I will eat my
laptop.


> To your latter point, you can run a timing attack as slowly as you like,
> and a lot of sites have very poor monitoring for things like 404s. A month
> or more of patient low-level attacking to gain access to a prime target is
> well worth it.
>

The longer you draw out the attack the less consistent the results. Code
changes, hardware changes, data set sizes change, passwords change, BGP
routes change, peering agreements change, phases of the moon change, etc.

If you can tune your web app to the point where response time variance is
small enough to notice a couple dozen CPU cycles of variance, and can
maintain that sort of consistency over an extended period of time, either
you're not doing anything interesting, you're running a Commodore 32, or
you're my new hero.


> The point being that we all ought to take timing attacks seriously. They're
> not nearly as unrealistic as people think.
>

Sure, broadly speaking they're an attack vector. In this particular scenario
it's silly wankery by smart people who put up with the same sort of silly
wankery from me sometimes. So whatever.

<3,

Mike




Re: Purpose of constant_time_compare?

2010-12-08 Thread Mike Malone
Yea... in reality I'd bet my paycheck that the answer is no. Despite Coda's
blog post, you can't use the jitter in HTTP requests to gain any insight
into where a string match fails.

Even if you could do so with hundreds of requests, it's fairly obvious that
an attack is taking place when you get that many bad requests for one
account.

Mike

On Wed, Dec 8, 2010 at 12:10 PM, Alex Gaynor  wrote:

>
>
> On Wed, Dec 8, 2010 at 3:08 PM, Jonas H.  wrote:
>
>> Hello out there,
>>
>> what is the point of `django.utils.crypto.constant_time_compare`? I
>> understand it takes O(n) time no matter what input it is fed, but of
>> what avail is it?
>>
>> Can the time spent in *one single string comparison* really make such a
>> huge difference?
>>
>> Confused,
>> Jonas
>>
> In theory, yes.  These are a class of attacks known as timing attacks:
> http://en.wikipedia.org/wiki/Timing_attack.  That said I don't know of any
> actual real world attacks using these, but better safe than sorry.
>
> Alex
>
> --
> "I disapprove of what you say, but I will defend to the death your right to
> say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
> "The people's good is the highest law." -- Cicero
> "Code can always be simpler than you think, but never as simple as you
> want" -- Me
>

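For readers following the question above, here is a minimal sketch of what a constant-time comparison can look like. The function name is made up for this example, and it is not claimed to match the code in django.utils.crypto; it only illustrates the general idea of touching every byte instead of returning at the first mismatch.

```python
def constant_time_compare_sketch(a: bytes, b: bytes) -> bool:
    """Compare two byte strings without stopping at the first mismatch."""
    # The length check can return early: it reveals the length,
    # but not the position of a differing byte.
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        # XOR is non-zero when the bytes differ; OR remembers any difference.
        diff |= x ^ y
    return diff == 0
```

On current Python versions, hmac.compare_digest in the standard library serves the same purpose and is probably the better choice than a hand-rolled loop.
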



Re: Application to update the Test Suite

2010-04-10 Thread Mike Malone
On Fri, Apr 9, 2010 at 5:29 PM, Gabriel Hurley  wrote:
> Maybe it's an overly simplistic question, but: what makes the tests
> slow currently? It's not simply the volume of them. It's more than
> possible for Python to race through hundreds of tests per second under
> the right conditions.

Tests are slow largely because you have to load fixtures into the
database and roll them back for every test case. This is required to
guarantee a consistent initial state for each test. Speeding tests up
would be great, but a lot of work has already been done in this area
(e.g., tests are performed in a transaction that's rolled back in
teardown, if possible, since that's faster than dropping and reloading
each table) so I'm not sure how much low-hanging fruit is left. Keep
in mind that it's better to have slow tests that do the right thing
than fast ones that don't. Of course fast tests that do the right
thing would be ideal.
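
The transaction trick mentioned above can be sketched with nothing but the standard library. This is an illustration of the idea only, not Django's test runner; run_isolated and the item table are names invented for the example.

```python
import sqlite3


def run_isolated(conn, load_fixtures, test_body):
    """Run test_body inside a transaction that is always rolled back."""
    conn.execute("BEGIN")
    try:
        load_fixtures(conn)
        test_body(conn)
    finally:
        # Undo the fixtures and anything the test itself wrote.
        conn.execute("ROLLBACK")


# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN/ROLLBACK above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE item (name TEXT)")
```

Each test sees the fixture rows, and the table is empty again afterwards, without any table being dropped or recreated.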

Mike




Re: Deprecating cmemcache, adding pylibmc

2010-02-22 Thread Mike Malone
> At this point in the release process, I'm not sure we can do
> everything that's being talked about in this thread. Given that we're
> feature-frozen and that there's no way we can spring a completely new
> cache backend on people at the last minute, here's what's possible
> within our release process right now for 1.2:
>
> 1. Specifying memcached as the cache backend continues using the same
> "memcached://" scheme as it always has. There is no way we can change
> that in the 1.2 timeframe.
>
> 2. The memcached backend in Django should look first for the correct
> library, and fall back to the old one as needed.
>
> 3. When falling back to the old memcached library, 1.2 should raise
> PendingDeprecationWarning; 1.3 should promote that to a
> DeprecationWarning and 1.4 should remove the support entirely.
>
> Anything and everything else being discussed is out of scope for 1.2
> and must wait for the 1.3 feature proposal window.
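
Points 2 and 3 above describe a "try the preferred library, fall back with a warning" pattern. A rough sketch of that pattern follows; import_with_fallback is a hypothetical helper and the module names are passed in by the caller, since this is not the actual backend code.

```python
import importlib
import warnings


def import_with_fallback(preferred, legacy):
    """Import `preferred`; fall back to `legacy` with a deprecation warning."""
    try:
        return importlib.import_module(preferred)
    except ImportError:
        module = importlib.import_module(legacy)
        warnings.warn(
            "%s is deprecated as a cache binding; install %s instead"
            % (legacy, preferred),
            PendingDeprecationWarning,
            stacklevel=2,
        )
        return module
```

Promoting the warning in a later release would then only mean changing the warning category.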

Yea, at this point doing anything significant in 1.2 seems like a bad
idea. But, some of this stuff could potentially be done using query
string args to the cache backend URI, which would maintain backwards
compatibility. Without thinking too much about naming the arguments,
it'd look something like this:

CACHE_BACKEND = "memcached://127.0.0.1:11211/?binding=pylibmc&compression=True&compress_length=15&zero_timeout=True"

Still, I'm hesitant to rush into this sort of decision because I'd
hate to make the wrong one and then have to support it going forward.

Also, since the URI isn't known at the module level (it's passed into
CacheClass.__init__) we'd probably have to import the memcache client
in __init__ or something. Not the end of the world, but kind of messy.
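
To make the query-string idea concrete, here is a small sketch of pulling options out of such a URI, assuming the arguments are separated with the usual "&". parse_backend_uri is a hypothetical helper, and the "memcache" default is only a placeholder for whatever binding would be the default.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_backend_uri(uri):
    """Split a cache backend URI into (scheme, host, params)."""
    parts = urlsplit(uri)
    params = dict(parse_qsl(parts.query))
    return parts.scheme, parts.netloc, params


scheme, host, params = parse_backend_uri(
    "memcached://127.0.0.1:11211/?binding=pylibmc&compression=True"
)
# The binding would then be imported lazily (for example inside __init__),
# because the URI is not known when the module is first imported.
binding_name = params.get("binding", "memcache")
```

Note that every value arrives as a string, so something like compression=True would still need converting before use.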

> Mike's http://github.com/mmalone/django-caching has some good examples of 
> fixing 2/3

FWIW, the "more correct" way to do this is to create your own custom
cache backend like this: http://gist.github.com/299905 then use the
full module path in the CACHE_BACKEND setting, like:

CACHE_BACKEND = 'foo.bar.custom_memcache://127.0.0.1:11211/'

Mike




Re: What do people think about the get_absolute_url proposal?

2009-12-16 Thread Mike Malone
On Wed, Dec 16, 2009 at 11:10 AM, Alex Gaynor <alex.gay...@gmail.com> wrote:
> On Wed, Dec 16, 2009 at 2:02 PM, Mike Malone <mjmal...@gmail.com> wrote:
>>> The way i see it (which may be wrong), this is not a proposal to make
>>> the request object global or replace/refactor the contrib.site app. In
>>> fact, some of the use cases mentioned strike me as things that would
>>> require overriding the default get_absolute_url method anyway. It
>>> seems to me like everyone is arguing over things outside the scope of
>>> this proposal.
>>
>> Actually the fundamental disagreement (as far as I can tell) is over
>> whether the absolute URL should be built using information pulled from
>> various application settings (Site module, settings file, etc) or
>> information pulled from the currently active request.
>>
>> In my opinion, using the Site module and settings files is damn
>> annoying. I never use the Site module, and in my experience having to
>> change the "FRONTEND_URL" of your app every time you push to a
>> different environment is tedious and a frequent source of subtle
>> problems. Moreover, the request information in your current request
>> _should_ always be correct. If someone requests a non-canonical URL
>> you should redirect (CNAME, 301, etc) to the canonical URL. If your
>> load balancer isn't sending the correct headers then the load balancer
>> is broken, not Django.
>>
>> That said, it sounds like there are a number of special cases where it
>> would be useful to override these settings. So maybe the best course of
>> action is to try to use the configured Site information and fall back
>> on "RequestSite", which uses information from the currently active
>> request.
>>
>> Thoughts?
>>
>> Mike
>>
>
> I am very strongly against making the request a thread local.  We have
> used thread locals in a few places (urlconf and i18n are the obvious
> ones), and while they don't put a smile on my face they do serve a
> very narrow, well defined purpose, in places where it would often be
> impossible to get the appropriate context in.
>
> However, I think making the entire request object (or even just the
> domain + SSL state) a thread local is an incredibly open-ended
> solution that is rife with potential for abuse.
>
> However, I come bearing a solution!
>
> Instead of having get_url() or whatever the method is named return a
> string, have it return a URL object, specifically instantiated with
> REQUEST_HOST for its host value.  Then the caller can pass this value
> around and when it gets returned to the appropriate place where the
> request is in scope it can interpolate its values into the URL object
> appropriately.  This may necessitate adding a template filter ({{
> obj.get_url|interpolate_request:request }}).

How's that different than the current situation, where we return an
absolute URL reference that can be converted into an absolute URL
using request.build_absolute_uri?
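
For anyone unfamiliar with that step, the conversion amounts to resolving a path against the URL of the current request. A rough standalone sketch, not Django's actual implementation; the function name and parameters are invented for illustration:

```python
from urllib.parse import urljoin


def build_absolute_uri_sketch(scheme, host, current_path, location):
    """Resolve `location` against the URL of the current request."""
    current_url = "%s://%s%s" % (scheme, host, current_path)
    # urljoin leaves already-absolute URLs alone and resolves the rest.
    return urljoin(current_url, location)
```

Here scheme, host and current_path stand in for what the request object already knows, which is why no setting is needed.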

Mike

>
> Alex
>
> --
> "I disapprove of what you say, but I will defend to the death your
> right to say it." -- Voltaire
> "The people's good is the highest law." -- Cicero
> "Code can always be simpler than you think, but never as simple as you
> want" -- Me
>





Re: What do people think about the get_absolute_url proposal?

2009-12-16 Thread Mike Malone
> The way i see it (which may be wrong), this is not a proposal to make
> the request object global or replace/refactor the contrib.site app. In
> fact, some of the use cases mentioned strike me as things that would
> require overriding the default get_absolute_url method anyway. It
> seems to me like everyone is arguing over things outside the scope of
> this proposal.

Actually the fundamental disagreement (as far as I can tell) is over
whether the absolute URL should be built using information pulled from
various application settings (Site module, settings file, etc) or
information pulled from the currently active request.

In my opinion, using the Site module and settings files is damn
annoying. I never use the Site module, and in my experience having to
change the "FRONTEND_URL" of your app every time you push to a
different environment is tedious and a frequent source of subtle
problems. Moreover, the request information in your current request
_should_ always be correct. If someone requests a non-canonical URL
you should redirect (CNAME, 301, etc) to the canonical URL. If your
load balancer isn't sending the correct headers then the load balancer
is broken, not Django.

That said, it sounds like there are a number of special cases where it
would be useful to override these settings. So maybe the best course of
action is to try to use the configured Site information and fall back
on "RequestSite", which uses information from the currently active
request.
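
Roughly, the fallback order suggested above would look like this. It is a sketch only: resolve_host is a hypothetical name, and a plain dict stands in for the request's headers.

```python
def resolve_host(configured_domain, request_headers):
    """Prefer an explicitly configured domain, else use the request's Host."""
    if configured_domain:
        return configured_domain
    host = request_headers.get("Host")
    if host is None:
        raise LookupError("no configured domain and no Host header")
    return host
```

Deployments that never configure a domain get the request's host, while the special cases can still pin one explicitly.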

Thoughts?

Mike





Re: What do people think about the get_absolute_url proposal?

2009-12-14 Thread Mike Malone
On Sat, Dec 12, 2009 at 3:58 PM, Ivan Sagalaev
<man...@softwaremaniacs.org> wrote:
> Mike Malone wrote:
>> On Tue, Dec 8, 2009 at 7:52 PM, Russell Keith-Magee
>> <freakboy3...@gmail.com> wrote:
>>>  4. I share Mike's concern about using settings.SITE_ID to determine
>>> the current host, but I'm not sure I have any suggestions on how we
>>> could practically use request, short of encouraging the use of a
>>> template tag like {% obj_url %} that can use data from the request
>>> context if it is available.
>>
>> It's not exactly pretty, but: http://paste.pocoo.org/show/155827/
>
> All these are variations of the same thing: making request object
> global. Do we really want it?
>
> For one thing it breaks what Django has always got right: being able to
> work in a script outside of a web request loop. So relying on
> contrib.Site may be inconvenient but it's way better than a global
> request. What I take from code like this:

Well, not really. It's making a way to generate a URL based on the
request object global. I agree that it's not ideal, but it's not the
same as just making the request object global. It'd be much easier to
patch around this, for example, than it would be to work around a
global request object.

You wouldn't have any trouble in a standalone script unless you tried
to call the get_absolute_url() function.

>     protocol = getattr(settings, "PROTOCOL", "http")
>     domain = Site.objects.get_current().domain
>     port = getattr(settings, "PORT", "")
>
> ... is not that we should get all these from a request but that we
> should add "protocol" and "port" fields to the Site model.

I will reiterate that, in practice, this is a huge pain in the ass.

Sucks that we have to choose the lesser of two evils. Maybe we could
make a RequestSite object that does it the global way and then
get_absolute_url can try to use that and fall back on Site (or vice
versa?) or something? Not sure if that's the best of both worlds or
the worst...

Mike





Re: What do people think about the get_absolute_url proposal?

2009-12-09 Thread Mike Malone
On Tue, Dec 8, 2009 at 7:52 PM, Russell Keith-Magee wrote:
>  4. I share Mike's concern about using settings.SITE_ID to determine
> the current host, but I'm not sure I have any suggestions on how we
> could practically use request, short of encouraging the use of a
> template tag like {% obj_url %} that can use data from the request
> context if it is available.

It's not exactly pretty, but: http://paste.pocoo.org/show/155827/





Re: What do people think about the get_absolute_url proposal?

2009-12-08 Thread Mike Malone
I'd much rather have this information come from the current request
vs. coming from settings. Relying on the Site is particularly
annoying. I like the implementation of build_absolute_uri() in
django.http.HttpRequest. The hard part is getting the request object
to a place where it's usable by models without introducing global
state.

Mike

On Mon, Dec 7, 2009 at 2:43 PM, Simon Willison  wrote:
> This made it to the 1.2 feature list:
>
> http://code.djangoproject.com/wiki/ReplacingGetAbsoluteUrl
>
> If we want this in 1.2, it could be as simple as merging the get_url /
> get_url_path methods in to the base Model class, rolling a few unit
> tests and writing some documentation. It feels like we should discuss
> it a bit first though - the proposal hasn't really seen much
> discussion since it was originally put together back in September last
> year.
>
> Cheers,
>
> Simon
>





Re: Template Caching - Ticket #6262

2009-11-16 Thread Mike Malone
K, I just uploaded another patch to ticket #6262. Comments inline.

On Mon, Nov 16, 2009 at 5:58 AM, Russell Keith-Magee
<freakboy3...@gmail.com> wrote:
> On Thu, Nov 12, 2009 at 9:15 AM, Mike Malone <mjmal...@gmail.com> wrote:
>> Sup,
>>
>> I've been working on template caching (ticket #6262) with a mister
>> Alex Gaynor. The patch is getting pretty stable and I think it's close
>> to merge-able, so if anyone wants to take a look at what we've got and
>> provide feedback: go!
>>
>> Interesting background reading for people who haven't been part of
>> this conversation:
>> http://groups.google.com/group/django-developers/browse_thread/thread/b289b871285b86f5/b97ba9e2e9b9ad86
>>
>> Ticket: http://code.djangoproject.com/ticket/6262
>>
>> Latest patch: 
>> http://code.djangoproject.com/attachment/ticket/6262/cache_templates.5.diff
>
> Hi Mike,
>
> Here are my review comments. On the whole, I like what I see. These
> are pretty much all fairly minor bugs, documentation, or cosmetic
> interface changes.
>
>  * In the process of reversing the direction of the stack in Context,
> the get() method has been neglected - it's still using the old stack
> direction.

Good catch. Fixed in the new patch. I also added a test since this
functionality didn't seem to be covered.

>  * The '.loader' extension on TEMPLATE_LOADER entries is consistent
> with the old TEMPLATE_LOADER settings (i.e., pointing at a specific
> instance/method), but not with the other pluggable backend APIs that
> we have.
>
> For example, when you specify the caching and mail backends, you
> provide the name of the module, and it is assumed that the backend
> module will contain a CacheClass and EmailBackend class, respectively.
> For the sake of consistency and clarity, I'd rather see the 'loader'
> name suffix as the implied (and required) name for the object in the
> loader module, rather than needing to explicitly specify the name of
> the loader instance.
>
> Obviously, the legacy support for the get_template_loader function
> will need to be an exception here, but moving forward, we should be
> aiming at a clean API.
>
>  * On the subject of specifying loaders - in order to use the cached
> loader, we need to import the cached loader, and have support for
> callable loaders in find_template_loader. Requiring imports in the
> settings file seems like a bit of a wart to me.
>
> Here's an alternate proposal - rather than allowing callables, how
> about allowing entries in TEMPLATE_LOADERS to be tuples, and
> interpreting the [1:] elements of the tuple as the arguments to the
> loader - for example:
>
> TEMPLATE_LOADERS = (
>    ('django.template.loaders.cached', (
>            'django.template.loaders.filesystem.loader',
>            'django.template.loaders.app_directories.loader',
>        )
>    )
> )
>
> Theoretically, this means we could do away with the TEMPLATE_DIRS
> setting, since we could specify template directories in the same
> fashion. In practice, this probably isn't worth doing, but it is worth
> pointing out as a possibility.
>
> This also means we could get away from needing a module-level
> instantiation of Loader() objects - you just look in the module for
> the Loader class, and find_template_loader instantiates it with the
> appropriate arguments (or no arguments if the loader is specified as a
> string, rather than a tuple)

Interesting idea. I had mixed feelings about the import. I'd rather
have this than a new setting.

The only problem here is that it's a bit tricky to tell whether you're
using the old style (a string specifying a callable) or the new style
(a string specifying a module that has a 'Loader' class). I've
implemented this in the new patch, and it seems to work, but it could
probably use a code review. If we're all in agreement I'll go ahead
and update the docs to reflect this new style.
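
To show where the ambiguity comes from, here is one way the old-style/new-style check could be written. This is a sketch, not the code in the patch; find_template_loader_sketch is a name invented for the example, and it assumes a new-style module exposes a class called Loader.

```python
import importlib


def find_template_loader_sketch(entry):
    """Turn one TEMPLATE_LOADERS entry into a loader object."""
    if isinstance(entry, (tuple, list)):
        # Tuple form: first element is the path, the rest are arguments.
        path, args = entry[0], tuple(entry[1:])
    else:
        path, args = entry, ()
    try:
        module = importlib.import_module(path)
    except ImportError:
        module = None
    loader_class = getattr(module, "Loader", None)
    if loader_class is not None:
        # New style: the path names a module that defines a Loader class.
        return loader_class(*args)
    # Old style: the path names a callable inside a module.
    module_path, attr = path.rsplit(".", 1)
    return getattr(importlib.import_module(module_path), attr)
```

The weak spot is the first import: a new-style module that fails to import for an unrelated reason gets silently treated as an old-style path, which is part of why this deserves a review.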

>  * As someone who will need to maintain this documentation over time,
> I don't want to have to keep abreast of changes in other template
> languages to ensure that Django's documentation is accurate. I have no
> problems with mentioning Jinja and Cheetah as other languages that
> could be supported in theory, but I'd rather not give the specific
> implementation example.

Yea, I wasn't sure about including that example. I wrote it up as a
proof of concept, but it probably belongs somewhere as a blog post
instead of being in the official docs.

>  * Also on documentation - the load_template_source methods should be
> mentioned in internals/deprecation.txt, and the cached templates
> feature should be noted in the 1.2 release notes.
>
>  * The load_template_source methods should raise a PendingDeprecationWarning

Included i

Template Caching - Ticket #6262

2009-11-11 Thread Mike Malone
Sup,

I've been working on template caching (ticket #6262) with a mister
Alex Gaynor. The patch is getting pretty stable and I think it's close
to merge-able, so if anyone wants to take a look at what we've got and
provide feedback: go!

Interesting background reading for people who haven't been part of
this conversation:
http://groups.google.com/group/django-developers/browse_thread/thread/b289b871285b86f5/b97ba9e2e9b9ad86

Ticket: http://code.djangoproject.com/ticket/6262

Latest patch: 
http://code.djangoproject.com/attachment/ticket/6262/cache_templates.5.diff

The more code reviewers the better, but if you don't have time to read
through the nitty gritty here's a high level overview of the changes:

1. The workhorse django.template.loader.get_template() function will
now return compiled Template objects directly if a template loader
returns one (where a Template object is defined as something with a
render() method).
2. A RenderContext has been added to Context instances. This is
necessary so we can give template.Node instances a thread-safe place
to store state between calls to Node.render().
3. The built-in template tags were updated to use the render context
(specifically, CycleNode, BlockNode, and ExtendsNode).
4. A caching loader has been added to the set of default loaders. To
use it, you instantiate the loader, passing a list of other loaders
that it should wrap. The first time you ask the loader for a template
it'll go through the wrapped loaders and find it, then cache it in
memory. Subsequent requests for the same template are served from
cache. Hazzah!
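
For a feel of how item 4 behaves, here is a stripped-down sketch of a caching loader. It is not the code in the patch: the class name is invented, and it assumes each wrapped loader is a callable that returns a template object or None.

```python
class CachedLoaderSketch:
    """Wrap other loaders and keep what they return in memory."""

    def __init__(self, loaders):
        self.loaders = loaders
        self.cache = {}

    def load_template(self, name):
        # Only the first request for a name reaches the wrapped loaders.
        if name not in self.cache:
            self.cache[name] = self._find(name)
        return self.cache[name]

    def _find(self, name):
        for loader in self.loaders:
            template = loader(name)
            if template is not None:
                return template
        raise LookupError(name)
```

Because the same compiled object is handed out on every request, any per-render state has to live somewhere other than the nodes, which is what the render context in item 2 is for.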

The patch is pretty complete, it includes tests and docs, so please
take a look! I stuck the updated docs up on my website temporarily, so
if you want to take a look at them in a slightly more readable format
you can check out these URLs:
  http://immike.net/django-docs/ref/templates/api.html#loader-types
  http://immike.net/django-docs/howto/custom-template-tags.html#template-tag-thread-safety
  http://immike.net/django-docs/ref/templates/api.html#using-an-alternative-template-language

Thanks,

Mike





Re: Integrating Django with Tornado's web server

2009-09-14 Thread Mike Malone

> I just checked in a change to Tornado that enables you to run any WSGI-
> compatible framework on Tornado's HTTP server so that Django apps
> could run on top of Tornado's HTTP server and benefit from some of the
> performance work we have done. (I just sent a message to django-users@
> with getting started instructions as well, but if you are interested,
> take a look at 
> http://github.com/facebook/tornado/blob/master/tornado/wsgi.py#L188).
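
As background for the WSGI approach described above: a WSGI application is just a callable with a fixed signature, which is why one container can host any compliant framework. A minimal standard-library sketch follows; call_app is a hypothetical helper for exercising the app without a server and is not part of Tornado or Django.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI callable of the kind any WSGI server can host."""
    body = b"Hello from a WSGI app\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app directly, without a server, for a quick check."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

A Django project exposes the same kind of callable through its WSGI handler, so a server that can run this can, in principle, run Django too.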

Great news, there was a lot of discussion at DjangoCon last week after
the Tornado launch about this possibility. There was a bit of hacking
on it during the Django sprints, but things seemed to stall as people
realized there were a few incompatibilities due to Tornado munging
some of the HTTP headers (I'm not sure of the details, as I wasn't working on
this stuff personally). Awesome to hear you made it work!

> I chose the WSGI approach because it is generic and applies to all
> frameworks, but Django is obviously the most widely used. I am curious
> if there is any benefit to implementing more "native" support in
> django.core.handlers or if WSGI is the preferred way of adding support
> for new servers. If there is any performance or usability benefit, let
> me know, because we would be happy to contribute our time to make it
> happen.

It would be interesting if you could formalize your issues with WSGI
(it's not a horrible solution, but there are certainly places where it
could be improved). Armin Ronacher was talking about updating the WSGI
PEP at DjangoCon and was soliciting ideas, so you may want to get in
touch with him and offer some suggestions for making WSGI work with
high performance containers like Tornado.

Mike




Re: Template Caching

2009-08-06 Thread Mike Malone

> On Thu, Aug 6, 2009 at 2:55 AM, Russell
> Keith-Magee wrote:
>> As is noted in the ticket, one of the reasons that this wasn't done
>> originally was that the performance boost wasn't seen as being that
>> considerable.
>
> I suspect there'll be a goodly speedup even for the common case, since
> what caching basically avoids here is the IO requirements of going to
> the disk. Processors have gotten lots more powerful over the last five
> years, but disk IO is just as slow.

I haven't done very extensive testing on projects other than my own,
but at the very least these changes shouldn't slow anything down. What
I'd really love to see, actually, is some empirical data from people
testing these changes on their own apps. I was really surprised at the
performance gains we saw, so I'd be interested in hearing if simpler
apps see a noticeable improvement as well.

> Finally, like Russ, I'm worried about the effect this will have on
> existing template tags. Auditing code I have lying around, I see at
> least a half-dozen tags that store state on self. I think it's even
> figured into some docs and books. So figuring out *some* way of at the
> very least easing that transition would help the pill go down quite a
> bit.

Yes, I agree that accommodating legacy tags is very important. Leaving
caching off by default should make things easier on people -- they can
choose to upgrade their existing code if and when they decide template
caching is worthwhile. The addition of the "parser context" should
also help. This feature should be well documented along with the risks
associated with storing state on self (note that instances can store
_some_ state if it doesn't change, like variable names and other
arguments).

The only dangerous things I found in the built-in template tags were
block, extends, and cycle. Fixing block & extends was tough (there was
some prior work here which helped a lot, so thanks to the folks who
worked on that). But cycle was a pretty trivial change (two or three
lines), and I suspect it's a more typical tag -- with the "parser
context" available I think most tags can be made cache/thread-safe
fairly easily.

I thought about giving each template tag a fresh ``data`` dictionary on
every template render. The tag could use it in ``render`` to store
state (you could just stick stuff in self.data[key], and self.data
would be set to something like context.parser_context[self]), but I
decided that was a bit too magical and wasn't really a big improvement
over explicitly using ``context.parser_context``.

Russ, I totally agree that we need to be careful not to screw up the
template tag stuff. The existing template-tag API is unchanged -- in
fact there's no need to change anything at all unless you enable
caching. The bits that probably need the most attention & review are
the block/extends changes. I spent a lot of time deciphering that code
and I'm fairly confident that the patch duplicates the existing
functionality, but it was rather complicated code and it's important
we don't mess something up that's used by pretty much every Django app
in the wild ;).

Mike




Template Caching

2009-08-05 Thread Mike Malone
Hey everyone,


I've been working on a patch for Django that would allow you to optionally
cache templates after they've been lexed and parsed (compiled) by the
template engine. I've got things far enough along that I have a working
implementation, so I thought I'd share here and see if anyone had any
thoughts / comments. I've attached my diff to ticket #6262, so visit
http://code.djangoproject.com/ticket/6262 to check it out.


I'd like to add template caching as a feature to Django 1.2, and am willing
to do whatever is necessary to make this happen.


If you'd rather read prose, here's a bit of an explanation / justification
for the changes:


Why?


At the moment, Django is reading templates from disk, lexing them, parsing
them, and then rendering them with the current context _every time_ a
template is rendered. If your blog.html template loops through an array of
15 blog posts and uses {% include %} to pull in post.html for each of them,
post.html is read from disk (or disk cache), lexed, parsed, and rendered 15
times. All but the render step is avoidable with a bit of caching.


I've done some rather crude benchmarking and, by my measurements, a template
takes about 1ms to lex and parse. That doesn't seem like much, but it adds
up. I'm working on a project that renders ~400 templates for the index page
(you need to do lots of extends and includes if you're making a reusable app
that you want to let people skin) and with this patch enabled we're seeing
template rendering time decrease by about 350ms. Big win.


How?


The only thing that's absolutely necessary to make cached templates possible
in Django is to add a branch in ``django.template.loader.get_template()``
that checks whether the returned template "source" is actually a compiled
template, in which case it is returned directly and the compilation step is
skipped. Once this is done, template caching can be implemented with a
custom template loader.


By simply checking that the template has a ``render`` method in
``get_template``, we get the added benefit of allowing users to write
loaders that return custom Template instances, or Template instances that
use an alternative template language like Jinja or Mako.
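
A rough sketch of that branch, assuming a loader hands back either a raw
source string or an object that already has a ``render`` method. The names
``CompiledTemplate``, ``compile_source`` and the ``find_template`` argument
are hypothetical stand-ins, not Django's real internals.

```python
class CompiledTemplate:
    """Hypothetical stand-in for a compiled template: anything with render()."""

    def __init__(self, source):
        self.source = source

    def render(self, context):
        return self.source.format(**context)


def compile_source(source):
    # Stand-in for the lex + parse step that caching wants to skip.
    return CompiledTemplate(source)


def get_template(name, find_template):
    result = find_template(name)
    # Duck-typed check: if the loader already returned something renderable,
    # hand it back untouched and skip compilation.
    if hasattr(result, "render"):
        return result
    return compile_source(result)


precompiled = CompiledTemplate("Hi {name}")
from_string = get_template("greeting.html", lambda name: "Hello {name}")
from_object = get_template("greeting.html", lambda name: precompiled)
```

Because the check is on ``render`` rather than on a specific class, a loader
could return a wrapper around a template from some other template language
and it would pass through the same way.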


Template-Tag State:


In order to make cached templates usable in practice (and backwards
compatible) some changes need to be made in the template tags as well. In
particular, the block, extends, and cycle template tags store state on
instances. Since template tags are instantiated when the template is parsed,
this state sticks around between template renders. This also means templates
are not thread-safe (a separate, but related problem) -- if two threads
share the same instantiated template and both try to render it problems can
arise.


To make template tags safe for cached templates (and thread-safe) all state
should be stored in the template context. But if we just stick state in the
context dictionary we're polluting the template namespace with irrelevant
stuff that probably shouldn't be exposed there. Therefore, I propose adding
an attribute to Context instances, ``parser_context`` that can be used to
store parser state.
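
To make the problem concrete, here is a small sketch comparing a cycle-style
tag that keeps its position on ``self`` with one that keeps it in
``parser_context``, keyed by the node. The classes are simplified,
hypothetical stand-ins (``parser_context`` is just a plain dict here), not
the patch's code.

```python
import itertools


class Context:
    """Hypothetical context: template variables omitted, parser state only."""

    def __init__(self):
        self.parser_context = {}


class UnsafeCycleNode:
    def __init__(self, values):
        # Mutable state on the node survives between renders of a cached template.
        self.cycle = itertools.cycle(values)

    def render(self, context):
        return next(self.cycle)


class SafeCycleNode:
    def __init__(self, values):
        # State that never changes (the arguments) is fine to keep on self.
        self.values = values

    def render(self, context):
        # Per-render state lives in the context, keyed by this node.
        if self not in context.parser_context:
            context.parser_context[self] = itertools.cycle(self.values)
        return next(context.parser_context[self])


def render_rows(node, rows=3):
    context = Context()  # a fresh context for each template render
    return [node.render(context) for _ in range(rows)]


unsafe = UnsafeCycleNode(["odd", "even"])
unsafe_first, unsafe_second = render_rows(unsafe), render_rows(unsafe)

safe = SafeCycleNode(["odd", "even"])
safe_first, safe_second = render_rows(safe), render_rows(safe)
```

With the node reused across renders, the unsafe version starts its second
render on "even" because it picks up where the first render left off; the
safe version starts on "odd" both times.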


Unlike the Context object, I believe parser context should be statically
scoped per Template render. That is, if the dictionary at the top of the
"stack" doesn't contain the requested key, a KeyError should be raised
immediately rather than walking the stack looking for the key in
dictionaries further up the stack. I think this makes sense since parser
state is generally associated with a single Template, and Templates are
rendered recursively (because of extends and include tags). My
implementation calls push() on the parser_context stack at the beginning of
each template render, and pop() once rendering is complete.
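
The scoping rule described above could look roughly like this. It is a
sketch under the stated assumptions (a stack of dicts where lookups only
consult the top one); the class name ``ParserContext`` and its layout are
hypothetical.

```python
class ParserContext:
    """Stack of dicts where only the top dict is visible to lookups."""

    def __init__(self):
        self.dicts = [{}]

    def push(self):
        # Called at the start of a template render.
        self.dicts.append({})

    def pop(self):
        # Called when that render finishes.
        return self.dicts.pop()

    def __getitem__(self, key):
        # Raises KeyError right away; never walks to dicts lower in the stack.
        return self.dicts[-1][key]

    def __setitem__(self, key, value):
        self.dicts[-1][key] = value

    def __contains__(self, key):
        return key in self.dicts[-1]


pc = ParserContext()
pc["blocks"] = "outer template state"

pc.push()  # a nested template starts rendering
visible_in_nested = "blocks" in pc
try:
    pc["blocks"]
    nested_lookup_failed = False
except KeyError:
    nested_lookup_failed = True
pc.pop()  # the nested render is done

restored = pc["blocks"]
```

The nested render cannot see the outer template's state, and the outer
state is intact again once the nested render has been popped.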


Unfortunately, since there are numerous template tags that exist "in the
wild," some work will probably have to be done to port existing Django
projects over to use cached templates. But since template caching will be
implemented using a custom loader, and off by default, users can choose to
enable it or not, so this shouldn't be a problem.


Refactoring Loaders:


At the moment, template loaders are implemented as module-level functions.
This makes them difficult to extend. I suggest refactoring the built-in
template loaders to be classes. By implementing __call__ in the
``BaseLoader`` and instantiating a module-level instance of each loader we
can maintain backwards compatibility.
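
A sketch of how that could work, assuming the old function-style loaders
were called as ``load_template_source(template_name, template_dirs)`` and
returned a (source, display name) pair. ``DictLoader`` is a hypothetical
example loader, and ``TemplateDoesNotExist`` is redefined locally so the
snippet stands alone.

```python
class TemplateDoesNotExist(Exception):
    pass


class BaseLoader:
    def __call__(self, template_name, template_dirs=None):
        # Calling the instance behaves like calling the old module-level function.
        return self.load_template_source(template_name, template_dirs)

    def load_template_source(self, template_name, template_dirs=None):
        raise NotImplementedError


class DictLoader(BaseLoader):
    """Hypothetical loader that serves templates from an in-memory dict."""

    def __init__(self, templates):
        self.templates = templates

    def load_template_source(self, template_name, template_dirs=None):
        try:
            return self.templates[template_name], template_name
        except KeyError:
            raise TemplateDoesNotExist(template_name)


# Module-level instance under the old function's name: existing callers
# that did `load_template_source(name)` keep working unchanged.
load_template_source = DictLoader({"base.html": "<html></html>"})

source, display_name = load_template_source("base.html")
```

New code can subclass ``BaseLoader`` and override one method, while old code
that imports and calls the module-level name doesn't notice the difference.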


Cached Template Loader Options:


Once all this groundwork is done, we need to decide how to implement a
caching template loader. There are several options:


1. Don't include one at all. Let users write their own and implement it
however they want.

2. Implement a generic caching loader that can be instantiated with a list
of loaders that it should try to use to load templates, caching the results.
This requires a bit of a change to
``django.template.loader.find_template_source()`` since the current
implementation assumes you're 

Re: Proposal: user-friendly API for multi-database support

2008-09-10 Thread Mike Malone
> On Sep 10, 10:24 pm, "Mike Malone" <[EMAIL PROTECTED]> wrote:
> > At Pownce, for example, we stick users to the master database for some
> > period of time (a couple of seconds, usually) after they post a new note.
> > The problem here (as Malcolm pointed out) is that related managers use the
> > default manager for the related field. So if I ask for a User's Notes, the
> > default Note manager is used. That manager is, presumably, where the
> > decision is going to be made as to whether the slave or the master should be
> > queried. But the Note manager has no way of knowing whether the User is
> > stuck to the master -- it doesn't even know that there's a User associated
> > with the query...
>
> That's really interesting. I wonder if that invalidates the whole
> approach I proposed, or merely means it needs some refining?
>

I think it just needs refining. My understanding is that related fields were
due for a refactor anyway, so this would probably be a good time to do /
think about it. I guess my point is that there needs to be some non-internal
API for getting at related field information, too. In any case, more thought
is required.

Mike




Re: Proposal: user-friendly API for multi-database support

2008-09-10 Thread Mike Malone
> Well... To be sure save() should always go to master because on slaves
> you just don't have permissions to save anything. So a parameter to
> save() is redundant.
>

Not so. There are certainly use-cases for more sophisticated database
architectures where, for example, the majority of the database tables are
written to the master and replicated to all slaves, while a couple of
write-heavy tables are sharded and written directly to individual slaves.
More common is a master-master replication strategy, where a particular User
(for example) is stuck to one of a pair of database servers that replicate
one another. In this case you'd want to be able to specify somehow which
server to save() to.

Mike




Re: Proposal: user-friendly API for multi-database support

2008-09-10 Thread Mike Malone
Wow... like Malcolm said, lots to digest here.

So to start, the "simple" master-slave replication scenario turns out not to
be so simple once you get into the implementation details. Replication lag
being what it is, you almost never want to query the slave for every SELECT.

At Pownce, for example, we stick users to the master database for some
period of time (a couple of seconds, usually) after they post a new note.
The problem here (as Malcolm pointed out) is that related managers use the
default manager for the related field. So if I ask for a User's Notes, the
default Note manager is used. That manager is, presumably, where the
decision is going to be made as to whether the slave or the master should be
queried. But the Note manager has no way of knowing whether the User is
stuck to the master -- it doesn't even know that there's a User associated
with the query...
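
To illustrate why the decision needs the User, here is a hypothetical sketch
of the "stick to master" rule on its own. It is not Pownce's code or any
Django API; ``StickyRouter``, its method names, and the two-second default
are assumptions made for the example.

```python
import time


class StickyRouter:
    """Route a user's reads to the master for a short while after a write."""

    def __init__(self, stick_seconds=2.0, clock=time.monotonic):
        self.stick_seconds = stick_seconds
        self.clock = clock
        self.last_write = {}  # user_id -> time of that user's last write

    def record_write(self, user_id):
        self.last_write[user_id] = self.clock()

    def db_for_read(self, user_id):
        wrote_at = self.last_write.get(user_id)
        if wrote_at is not None and self.clock() - wrote_at < self.stick_seconds:
            return "master"  # replication may not have caught up yet
        return "slave"


# A fake clock keeps the example deterministic.
now = [0.0]
router = StickyRouter(clock=lambda: now[0])

router.record_write(user_id=1)
right_after_post = router.db_for_read(user_id=1)
other_user = router.db_for_read(user_id=2)

now[0] = 5.0  # well past the sticky window
later = router.db_for_read(user_id=1)
```

Note that ``db_for_read`` cannot answer without a ``user_id``, which is
exactly the piece of information a default Note manager doesn't have when it
is reached through a related field.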

We've solved this by poking at a lot of the related fields internals.
Malcolm helped a lot, and he's probably one of the only people who could
have made it happen. It's not that much code, but it relies heavily on
internal API and is certainly not something that should be recommended.

Simon, from your first email it seems you're suggesting that the Manager
call Query.as_sql() and then parse the resulting SQL string? That seems like
it's going to encourage a lot of hacky/fragile solutions. IMO, the right
place for a decision like "should this User's notes come from the master, or
the slave?" is on the User model (or maybe User manager), not in the Note
manager.

The same problem comes up with sharding. Suppose, for example, Pownce
started sharding by User and putting each User's Notes on the same server
the User is on. We should be able to call User.notes.all() and get that
User's notes, but the Note manager can't easily tell what server it should
be querying, since it doesn't know about the User. Again, you could start
poking at the internals of Query and try to figure out what's going on, but
that doesn't seem like a particularly elegant solution...
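
The same dependency shows up in a toy version of the sharding scenario. This
is only a sketch: the modulo placement, the shard names, and ``NoteStore``
are all hypothetical, chosen to keep the example short.

```python
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for_user(user_id):
    # Hypothetical placement rule: the user's id picks the shard.
    return SHARDS[user_id % len(SHARDS)]


class NoteStore:
    """Toy stand-in for per-shard Note tables."""

    def __init__(self):
        self.by_shard = {name: [] for name in SHARDS}

    def add(self, user_id, text):
        # A note is stored on the same shard as its owner.
        self.by_shard[shard_for_user(user_id)].append((user_id, text))

    def notes_for(self, user_id):
        # The lookup has to start from the user to know which shard to query.
        shard = self.by_shard[shard_for_user(user_id)]
        return [text for uid, text in shard if uid == user_id]


store = NoteStore()
store.add(4, "hello")
store.add(7, "hi")  # users 4 and 7 happen to share a shard under this rule
store.add(5, "hey")

notes_for_4 = store.notes_for(4)
```

Every read goes through ``shard_for_user``, so whatever plays the role of
the Note manager needs the owning User passed in rather than having to
reverse-engineer it from the query.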

Mike




Re: Call for testing: streaming uploads (#2070)

2008-06-26 Thread Mike Malone
Hey Jacob,

FYI: Our environment isn't that bizarre (Apache/Debian), but we've been
running patch #2070 in production on Pownce for a couple weeks now (we
actually ran a back-ported version of #2070 on 0.96 before we moved to
trunk). It's been working beautifully, and has really improved performance
for large file uploads. Our current cap is 250MB, but I've tested uploads
that are closer to a gig in our staging environment with no problems.

Mike

On Thu, Jun 26, 2008 at 12:41 PM, Marty Alchin <[EMAIL PROTECTED]>
wrote:

>
> On Thu, Jun 26, 2008 at 3:14 PM, Jacob Kaplan-Moss
> <[EMAIL PROTECTED]> wrote:
> >
> > Hi folks --
> >
> > As far as I'm concerned, #2070, adding large streaming uploads, is
> > done. I'd like to get some public kicking-of-the-tires before I push
> > the change to trunk (which won't happen before Tuesday: I'm taking a
> > long weekend off).
>
> Yay!
>
> I can't help too much with most of your needs, except possibly coming
> up with a custom upload filter just to figure out how it works.
> Instead, I'll work on getting the file storage patch to play nicely
> with it, so that once it's merged, we're already a step ahead on
> another item in the list.
>
> I don't expect it to take much work, but I've been holding off until
> 2070 either merged or hit a stage like this, so I wasn't working with
> too much of a moving target. I'll stay in the loop on any changes that
> are made as a result of this testing, and get us in better shape once
> it lands.
>
> -Gul
>