Re: Cython usage within Django

2017-05-21 Thread Curtis Maloney

On 22/05/17 09:30, Tom Forbes wrote:
> Hey Carl and Russell,
> Thank you for your quick replies!
>
> Russell:
>
> You are of course right that the speed of the Python code itself is
> rarely the bottleneck, and that the usual suspects like the database are
> more important to optimize. However, I don't think it's necessarily
> always a waste of time to explore optimizing some Python/Django
> functions, if that only means moving them to a separate file and running
> Cython on them to get a speed boost.


Of course, I'm sure Russell won't object to being shown to be wrong, if you
feel like giving it a go anyway :)


--
Curtis

--
You received this message because you are subscribed to the Google Groups "Django 
developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/50d4b757-ca93-66b6-c518-7ab0793e7646%40tinbrain.net.
For more options, visit https://groups.google.com/d/optout.


Re: PostgreSQL aggregation and views through unmanaged models

2017-05-21 Thread Josh Smeaton
> Therefore I'd favor we keep the current adjustment in the master branch as it
> restores backward compatibility but I don't have strong feelings about reverting
> it either if it's deemed inappropriate.

Fixing the crash is the number 1 priority in my opinion, as it broke 
something that used to work.

Optimising aggregation for unmanaged models is a distant second goal. Make 
it right. Then make it fast.

It'd be nice to provide said optimisation for unmanaged models provided 
there was a palatable way of doing so. I'm not sure how users would be able 
to use your callable approach without subclassing the backend - unless that 
is the intention? Getting feedback from the reporters would be good of 
course.



PostgreSQL aggregation and views through unmanaged models

2017-05-21 Thread charettes
Hello fellow developers,

As some of you may know, PostgreSQL 9.1 added support for GROUP BY on only the
primary keys of the selected tables[0]. Five years ago it was reported[1] that
Django could rely on this feature to speed up aggregation on models backed
by tables with either many fields or a few large ones.
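
To make the difference concrete, here is a small plain-Python sketch of the two GROUP BY clauses involved. The helper `group_by_clause`, the table name and the column names are made up for illustration; this is not how the ORM's SQL compiler is actually written.

```python
def group_by_clause(table, columns, pk, pk_only):
    """Build an illustrative GROUP BY clause for a single table.

    With pk_only=False every selected column is listed. With pk_only=True
    only the primary key is listed, which PostgreSQL 9.1+ accepts for tables
    because the remaining columns are functionally dependent on the key.
    """
    grouped = [pk] if pk_only else columns
    return "GROUP BY " + ", ".join(f'"{table}"."{col}"' for col in grouped)


# Hypothetical table with a large text column.
columns = ["id", "title", "body", "created"]
print(group_by_clause("blog_post", columns, "id", pk_only=False))
print(group_by_clause("blog_post", columns, "id", pk_only=True))
```

The shorter clause is where the speed-up comes from: the database no longer has to compare wide columns such as `body` while grouping.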

Being affected by this slowdown myself, I decided to dive into the ORM internals
and managed to get a patch that made it into 1.9[2] thanks to Anssi's and Josh's
review[3].

One subtle thing I didn't know at the time is that the PostgreSQL query planner
isn't able to introspect the functional dependencies of database view columns
like it does with tables, which prevents the primary key GROUP BY optimization
from being used on views.

While Django doesn't officially support database views, it documents that
unmanaged models can be used to query them[4]. Performing aggregation on such
a model therefore generated an invalid query.

This was initially reported as a crashing bug 9 months ago[5], and the consensus
at the time was that it was an esoteric edge case since there were few reports
of breakage, so it went off my radar. Fast-forward to a month ago: this was
reported again[6], and it took the reporter quite a lot of effort to determine
the origin of the issue, pushing me to come up with a solution since I introduced
this behavior.

Before Claude made me realize this was a duplicate of the former report (which I
had completely forgotten about in the meantime), I implemented a patch and
committed it once it was reviewed[7].

When I closed the initial ticket as "fixed", the reporter brought to my attention
that this was now introducing a performance regression for unmanaged models
relying on aggregation, and that we should instead document how to disable this
optimization by creating a backend subclass as a workaround.

In my opinion the current situation is as follows. The optimization introduced a
break in backward compatibility in 1.9, as we've always documented that database
views could be queried using unmanaged models. If this issue had been
discovered during the 1.9 release cycle it would have been eligible for a
backport because it was a bug in a newly introduced feature. Turning this
optimization off for unmanaged models by assuming they could be views is only
going to degrade performance of queries using unmanaged models to perform
aggregation on tables with either a large number of columns or large columns
using PostgreSQL.

Therefore I'd favor we keep the current adjustment in the master branch as it
restores backward compatibility but I don't have strong feelings about reverting
it either if it's deemed inappropriate.

Another solution I came up with while writing this post would be to replace the
feature flag by a callable that takes a model as a single parameter and returns
whether or not the optimization can be performed against it. The default
implementation would return `model._meta.managed` but it would make it easier for
users affected by this to override in order to opt in or out based on their
application logic.
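
A rough sketch of what such a callable could look like, in plain Python so it runs without Django. The method name `can_group_by_pk_only`, the class names and the `table_backed` set are all hypothetical, and `fake_model` is only a stand-in for a model class exposing `_meta`. Nothing here is an existing Django API.

```python
from types import SimpleNamespace


class PostgresFeatures:
    """Stand-in for a backend features class (hypothetical)."""

    def can_group_by_pk_only(self, model):  # hypothetical method name
        # Default described above: only trust managed models, because an
        # unmanaged model may be backed by a view.
        return model._meta.managed


class ProjectFeatures(PostgresFeatures):
    """Project-level override that opts known table-backed models back in."""

    table_backed = {"legacy_invoice"}  # hypothetical db_table names

    def can_group_by_pk_only(self, model):
        if model._meta.db_table in self.table_backed:
            return True
        return super().can_group_by_pk_only(model)


def fake_model(db_table, managed):
    # Minimal stand-in exposing only the _meta options used in this sketch.
    return SimpleNamespace(_meta=SimpleNamespace(db_table=db_table, managed=managed))


features = ProjectFeatures()
print(features.can_group_by_pk_only(fake_model("report_view", managed=False)))
print(features.can_group_by_pk_only(fake_model("legacy_invoice", managed=False)))
```

Note that this sketch still uses a subclass to install the override; how users would plug in their own logic is left open here.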

Thank you for your time,
Simon

[0] https://www.postgresql.org/docs/9.1/static/sql-select.html#SQL-GROUPBY
[1] https://code.djangoproject.com/ticket/19259
[2] https://github.com/django/django/commit/dc27f3ee0c3eb9bb17d6cb764788eeaf73a371d7
[3] https://github.com/django/django/pull/4397
[4] https://docs.djangoproject.com/en/1.11/ref/models/options/#managed
[5] https://code.djangoproject.com/ticket/27241
[6] https://code.djangoproject.com/ticket/28107
[7] https://github.com/django/django/commit/daf2bd3efe53cbfc1c9fd00222b8315708023792



Re: Cython usage within Django

2017-05-21 Thread Tom Forbes
Hey Carl and Russell,
Thank you for your quick replies!

Carl:
Agreed, duplicate CI runs would have to be performed (which would double
the time or double the number of runners required). As I understand it the
Django project itself would not distribute pre-compiled wheels; the
setup.py Cython 'stuff' would handle this (and fail gracefully if anything
goes wrong, like no C compiler being available). The type annotations
sound interesting and would alleviate a fair bit of engineering effort;
keeping two duplicate copies in sync sounded horrible.

Russell:
The point release issues certainly sound troubling and almost make me want
to give up based on that alone. Can you elaborate on them at all - were
you doing anything particularly crazy or complex with your cythonified code?

You are of course right that the speed of the Python code itself is rarely
the bottleneck, and that the usual suspects like the database are more
important to optimize. However, I don't think it's necessarily always a waste
of time to explore optimizing some Python/Django functions, if that only means
moving them to a separate file and running Cython on them to get a speed
boost.

That's the dream at least, but it's rarely that simple in practice. After
perusing the Django code for some functions that look like they could
benefit from Cython it seems a lot are tightly coupled and could not be
extracted without a bit of effort. Plus the engineering/CI/release overhead
would be considerable.

So, perhaps it seems this is just a pipe dream and not worth the effort.
Thanks for replying anyway!

Tom


Re: Cython usage within Django

2017-05-21 Thread Russell Keith-Magee
Hi Tom,

My immediate reaction is No, for three reasons:

1. My experience has been that Cython isn’t especially stable.  Admittedly, I 
haven’t looked at it for a couple of years, but when I did, I ended up getting 
caught in some really nasty bugs that came back and forth between micro 
versions.

2. Even if Cython *was* stable: The execution speed of your Django stack is 
almost certainly *not* the bottleneck of your application. Query time, database 
transfer time, and just basic client-end connection latency will, for most 
applications, be a *much* bigger performance problem than the execution time of 
the Python stack.

3. Even if the Django code in your app *was* your bottleneck, switching to PyPy 
as your interpreter will almost certainly give you better performance for less 
engineering effort.

If you want to do some experimentation, by all means go right ahead; however, I 
would caution you that any patch you produce will need to demonstrate a 
*significant* improvement in real-world use cases for us to adopt the 
engineering overhead of integrating Cython into Django’s runtime environment.

Yours
Russ Magee %-)




Re: Cython usage within Django

2017-05-21 Thread 'Carl Meyer' via Django developers (Contributions to Django itself)
On 05/21/2017 02:59 PM, Tom Forbes wrote:
> There are lots of considerations to take into account (like ensuring the
> Cython functions are in sync with the fallback ones)

It's possible to only have one version of the code, using only Python
syntax, and conditionally compile it with Cython if available. This
gives up some potential efficiency wins from type annotation, but avoids
the need to keep two copies in sync.
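
As a rough illustration of that single-source approach, a setup.py could try to import Cython and fall back to a pure-Python install when it is missing. `cythonize` from `Cython.Build` is Cython's real entry point; the helper name `optional_cython_extensions` and the module path in the comment are made up for this sketch, and handling a missing C compiler is not covered.

```python
def optional_cython_extensions(sources):
    """Return extension modules for `sources`, or [] for a pure-Python install.

    `sources` are ordinary .py files, so the same code runs uncompiled
    whenever this function returns an empty list.
    """
    if not sources:
        return []
    try:
        # Only importable when Cython is installed in the build environment.
        from Cython.Build import cythonize
    except ImportError:
        return []
    return cythonize(sources)


# In a setup.py this might be used as (hypothetical module path):
#   setup(..., ext_modules=optional_cython_extensions(["mypkg/html.py"]))
```

With this shape CI still needs two runs, one with Cython installed and one without, to exercise both code paths.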

Regardless, though, I think CI would need to run the tests both with
Cython and with non-Cython fallback.

We've moved toward releasing wheels instead of sdist on PyPI for recent
versions; for this to be useful it would mean releasing multiple binary
wheels for different platforms.

There's no question this could make a big difference to Django CPU
usage; the question is whether it's worth the added CI and release
complexity when it would likely provide little value to the majority of
Django users.

Carl



Cython usage within Django

2017-05-21 Thread Tom Forbes
Hello,
There was a very interesting talk at Pycon about using Cython to speed up
hotspots in Python programs:

https://www.youtube.com/watch?v=_1MSX7V28Po

It got me wondering about possibly using Cython in selected places within
Django. I realize that when Django was first released the distribution
situation was a bit more wild-west, resulting in part in Django not relying
on any third party dependencies. But that situation is rapidly changing
(see
https://github.com/django/deps/blob/master/draft/0007-dependency-policy.rst#background-and-motivation)
and with these changes could it also be time to investigate Cython usage
for select parts of Django?

Several popular projects use Cython 'speedup' modules with pure-python
fallbacks with great success, for example aiohttp (
https://github.com/aio-libs/aiohttp/blob/master/setup.py#L20). I did some
quick and dirty profiling of the 'django.utils.html.escape' function and
found that by simply including Cython as part of the build, and with no
syntax changes, the function executes twice as fast.
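
For readers unfamiliar with the 'speedup module with a pure-Python fallback' pattern, here is a minimal sketch. The module name `_html_speedups` is hypothetical, and `py_escape` is a simplified stand-in rather than Django's actual `escape` implementation.

```python
import importlib


def py_escape(text):
    """Pure-Python fallback: escape HTML special characters."""
    return (
        str(text)
        .replace("&", "&amp;")  # must come first so later entities survive
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


def load_escape(module_name="_html_speedups"):  # hypothetical compiled module
    """Return the compiled escape() if its module imports, else the fallback."""
    try:
        return importlib.import_module(module_name).escape
    except (ImportError, AttributeError):
        return py_escape


# Callers use `escape` without knowing which implementation was picked.
escape = load_escape()
print(escape("<a href='x'>Tom & Russ</a>"))
```

The rest of the code base only ever imports `escape`, so an installation without the compiled module behaves the same, just more slowly.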

There are lots of considerations to take into account (like ensuring the
Cython functions are in sync with the fallback ones), but it seems that it
could make a big difference with small, self-contained functions (like
html.escape or html.escapejs) that are executed frequently as part of a
request. Other functions that might be worth looking at include
http.multipartparser.parse_header or utils.baseconv.BaseConverter.convert.

My question is: is this something that's worth exploring, or is it
outside of the realms of possibility?



Re: Missing support for gettext fallback translations

2017-05-21 Thread Patryk Zawadzki
On Sunday, 21 May 2017 01:35:28 UTC+2, Ramiro Morales wrote:
>
> I'm also surprised by your findings. I guess it's something we simply took 
> for granted. It's mentioned in the [1]docs and has been so for [2]years.
>

It's not the same case though. Docs say that if `de-at` is not available, 
Django will use `de` and that's the case. What does not work is the case 
where `de-at` exists but is a sparse catalog of only translation 
differences specific to `de-at`. In that situation translations missing 
from `de-at` should be filled in by falling back to the generic `de` 
catalog but Django doesn't support it.
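
The behaviour being asked for can be sketched with Python's standard `gettext` module, whose translation objects support chaining through `add_fallback()`. `DictTranslations` and the catalog contents below are made up for illustration (a real setup would load .mo files), and this is not how Django currently builds its catalogs.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog backed by a dict instead of a .mo file (illustrative only)."""

    def __init__(self, catalog):
        super().__init__()
        self._messages = catalog

    def gettext(self, message):
        if message in self._messages:
            return self._messages[message]
        # NullTranslations.gettext() consults the fallback, if one was added,
        # and otherwise returns the message unchanged.
        return super().gettext(message)


# Generic German catalog and a sparse Austrian one holding only differences.
de = DictTranslations({"Hello": "Hallo", "January": "Januar"})
de_at = DictTranslations({"January": "Jänner"})
de_at.add_fallback(de)

print(de_at.gettext("January"))
print(de_at.gettext("Hello"))
```

Here `de-at` answers for the strings it overrides and every other lookup falls through to `de`.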



Re: Review of DEP 201 - simplified routing syntax

2017-05-21 Thread Aymeric Augustin
Hello,

The technical board accepted DEP 201:
https://github.com/django/deps/blob/master/accepted/0201-simplified-routing-syntax.rst

Sjoerd has taken the lead on the implementation; please get in touch if you'd
like to help!

Thanks,

-- 
Aymeric.



> On 12 May 2017, at 14:19, Aymeric Augustin wrote:
> 
> After getting approval from Tom on IRC, I updated the DEP according to my 
> email below: https://github.com/django/deps/pull/41 
> 
> 
> The next steps are:
> 
> - account for any remaining feedback (I feel I made only minor changes 
> compared to the last version, hopefully we can wrap this up quickly now)
> - get approval from the technical board
> - complete the implementation!
> 
> -- 
> Aymeric.
> 
> 
> 
>> On 12 May 2017, at 12:32, Aymeric Augustin wrote:
>> 
>> Hello,
>> 
>> I reviewed the current version of DEP 201 as well as related discussions.
>> I took notes and wrote down arguments along the way. I'm sharing them
>> below. It may be useful to add some of these arguments to the DEP.
>> 
>> Sjoerd, Tom, I didn't want to edit your DEP directly, but if you agree with 
>> the items marked [Update DEP] below I can prepare a PR. I will now take a 
>> look at the pull requests implementing this DEP.
>> 
>> 
>> Should it live as a third-party package first?
>> 
>> The original motivation for this DEP was to make Django easier to use by 
>> people who aren't familiar with regexes.
>> 
>> While regexes are a powerful tool, notably for shell scripting, I find it 
>> counter-productive to make them a prerequisite for building a Django 
>> website. You can build a very nice and useful website with models, forms, 
>> templates, and the admin, without ever needing regexes — except, until now, 
>> for the URLconf!
>> 
>> Since we aren't going to say in the official tutorial "hey install this 
>> third-party package to manage your URLs", that goal can only be met by 
>> building the new system into Django.
>> 
>> Besides, I suspect many professional Django developers copy-paste regexes 
>> without a deep understanding of how they work. For example, I'd be surprised 
>> if everyone knew why it's wrong to match a numerical id in a URL with \d+ 
>> (answer at the bottom of this email).
>> 
>> Not only is the new system easier for beginners, but I think it'll also be 
>> adopted by experienced developers to reduce the risk of mistakes in 
>> URLpatterns, which are an inevitable downside of their syntax. Django can 
>> solve problems like avoiding \d+ for everyone.
>> 
>> Anecdote: I keep making hard-to-detect errors in URL regexes. The only URL 
>> regexes I wrote that can't be replicated with the new system are very 
>> dubious and could easily be replaced with a more explicit `if 
>> some_condition(request.path): raise Http404` in the corresponding view. I 
>> will be happy to convert my projects to the new system.
>> 
>> No progress was made in this area since 2008 because URL regexes are a minor 
>> annoyance. After you write them, you never see them again. I think that 
>> explains why no popular alternative emerged until now.
>> 
>> Since there's a significant amount of prior art in other projects, a strong 
>> consensus on DEP 201 being a good approach, and a fairly narrow scope, it 
>> seems reasonable to design the new system directly into Django.
>> 
>> 
>> What alternatives would be possible?
>> 
>> I have some sympathy with the arguments for a pluggable URL resolver system, 
>> similar to what I did for templates in Django 1.8. However I don't see this 
>> happening any time soon because there's too little motivation to justify the 
>> effort. As I explained above, developers tend to live with whatever the 
>> framework provides.
>> 
>> Of course, if someone wants to write a fully pluggable URL resolver, that's 
>> great! But there's no momentum besides saying that "it should be done that 
>> way". Furthermore, adding the new system shouldn't make it more difficult to 
>> move to a fully pluggable system. If anything, it will clean up the 
>> internals and prepare further work in the area. Some changes of this kind 
>> were already committed.
>> 
>> DEP 201 is mostly independent from the problem of allowing multiple views to
>> match the same URL, that is, to resume resolving URL patterns if a view
>> applies some logic and
>> decides it can't handle a URL. This is perhaps the biggest complaint