Re: Refactoring the autoreloader

2017-12-10 Thread Tom Forbes
Thank you all for your feedback! I have fixed up my PR (
https://github.com/django/django/pull/8819) to add support for
watchman-based reloading, and it seems to be working well locally. After
further investigation I don't think using the Watchdog library is suitable
for Django, it seems to be un-maintained right now (which is a real shame).
Watchman seems to be faster even with constant reloading though, so I think
it's fine to just add support for that.

If anyone is interested they could take the changes to
`django/core/management/commands/runserver.py` and
`django/utils/autoreload.py`, and apply them to their installations. You
just need to install the `pywatchman` package + watchman itself, then run
manage.py with '--watchman'. I'd love some real-world feedback as to if
this improves reload speed and reduces CPU usage. Unfortunately Watchman
won't work on NFS volumes or virtualbox folders, so if you're using Vagrant
there isn't much that will help you here, which is also why we cannot make
this a default option - watchman will appear to work, but not pick up any
changes, if its run on such a volume.

I'll start work on adding documentation and tests next, but I'm quite lost
as to how I should test the Watchman integration. Should I be testing using
a real Watchman service, or should I mock it and hope that's enough?


On Thu, Oct 12, 2017 at 12:42 AM, Josh Smeaton 
wrote:

> I'd just like to second (or third) an appetite for this or a similar
> change. Lately I've started to notice very poor performance (high cpu
> usage, slow shutdowns, etc) with the django dev server on my project. I've
> moved to runserver_plus/werkzeug with watchdog in the mean time but it'd be
> good to have similar improvements via django natively.
>
> On Sunday, 8 October 2017 05:24:31 UTC+11, Aymeric Augustin wrote:
>>
>> Hello Tom,
>>
>> I think the PR is on the right track. Now it needs to be taken beyond the
>> "proof of concept" stage.
>>
>> Once you have code, docs, and if possibly some tests — currently there
>> are few due to the difficulty of testing auto reloading, especially edge
>> cases — for steps 1 and 2, I can review changes again and help you get it
>> merged.
>>
>> Since this is a major overhaul, we'll have to think carefully about how
>> the behavior changes and possible backwards incompatibilities. That said,
>> the autoreloader is a dev tool. Changing it isn't going to break anyone's
>> production. This makes the big-bang approach viable.
>>
>> Thanks for working on this!
>>
>> --
>> Aymeric.
>>
>>
>>
>> On 29 Sep 2017, at 21:03, Tom Forbes  wrote:
>>
>> Hello,
>> I've been thinking on and off about how to improve the autoreloader
>> implementation and I wanted to gather some feedback on potential solutions.
>>
>> For some background, Django uses a fairly basic autoreload implementation
>> that simply polls the last modified time for loaded Python files once a
>> second. While this isn't the most efficient, it does work and has worked
>> quite well for a long time. When running manage.py runserver, the
>> autoreloader will launch a child "manage.py" with the same arguments and
>> this child process actually runs Django and serves requests. To reload, the
>> child process exits with exit code 3 and the parent restarts it. The code
>> is some of the oldest in Django, with a fair bit of it not touched in 9-12
>> years.
>>
>> While it works (and I'm a believer in "if it isn't broke don't fix it")
>> there are some architectural and performance issues with the current code:
>>
>> - Polling every second is not very efficient
>> - Detecting when the child process has exited during startup (i.e problem
>> in settings.py) is problematic and the code is rather nasty
>> - i18n files are 'reloaded' when they change in a rather hacky way
>> (resetting private attributes in another module)
>> - There is limited support for extending the current implementation, and
>> there are cases during development where the parent autoreloader will
>> terminate.
>>
>> I don't want this email to be too long, so I'm going to summarize what I
>> think would be a good approach to tackling these problems.
>>
>> 1. Refactor the current implementation by removing `pyinotify`, redundant
>> python 2 checks and implement a 'file_changed' signal so other parts of
>> Django can react to file changes (i.e the i18n module can handle resetting
>> it's own state).
>> 2. Add support for the "watchdog" library as a replacement for pyinotify.
>> Watchdog implements file system notifications for all major platforms using
>> a fairly simple API, so we can remove polling and have instant reloading.
>> Also support Watchman, a notification Daemon from Facebook.
>> 3. Add support for more advanced features, like proper handing of startup
>> errors and socket sharing.
>>
>> I've got a merge request that implements all three stages as a proof of
>> concept, but I think it's far too much a change to be done at once and

Re: Refactoring the autoreloader

2017-10-11 Thread Josh Smeaton
I'd just like to second (or third) an appetite for this or a similar 
change. Lately I've started to notice very poor performance (high cpu 
usage, slow shutdowns, etc) with the django dev server on my project. I've 
moved to runserver_plus/werkzeug with watchdog in the mean time but it'd be 
good to have similar improvements via django natively.

On Sunday, 8 October 2017 05:24:31 UTC+11, Aymeric Augustin wrote:
>
> Hello Tom,
>
> I think the PR is on the right track. Now it needs to be taken beyond the 
> "proof of concept" stage.
>
> Once you have code, docs, and if possibly some tests — currently there are 
> few due to the difficulty of testing auto reloading, especially edge cases 
> — for steps 1 and 2, I can review changes again and help you get it merged.
>
> Since this is a major overhaul, we'll have to think carefully about how 
> the behavior changes and possible backwards incompatibilities. That said, 
> the autoreloader is a dev tool. Changing it isn't going to break anyone's 
> production. This makes the big-bang approach viable.
>
> Thanks for working on this!
>
> -- 
> Aymeric.
>
>
>
> On 29 Sep 2017, at 21:03, Tom Forbes  
> wrote:
>
> Hello,
> I've been thinking on and off about how to improve the autoreloader 
> implementation and I wanted to gather some feedback on potential solutions.
>
> For some background, Django uses a fairly basic autoreload implementation 
> that simply polls the last modified time for loaded Python files once a 
> second. While this isn't the most efficient, it does work and has worked 
> quite well for a long time. When running manage.py runserver, the 
> autoreloader will launch a child "manage.py" with the same arguments and 
> this child process actually runs Django and serves requests. To reload, the 
> child process exits with exit code 3 and the parent restarts it. The code 
> is some of the oldest in Django, with a fair bit of it not touched in 9-12 
> years.
>
> While it works (and I'm a believer in "if it isn't broke don't fix it") 
> there are some architectural and performance issues with the current code:
>
> - Polling every second is not very efficient
> - Detecting when the child process has exited during startup (i.e problem 
> in settings.py) is problematic and the code is rather nasty
> - i18n files are 'reloaded' when they change in a rather hacky way 
> (resetting private attributes in another module)
> - There is limited support for extending the current implementation, and 
> there are cases during development where the parent autoreloader will 
> terminate.
>
> I don't want this email to be too long, so I'm going to summarize what I 
> think would be a good approach to tackling these problems.
>
> 1. Refactor the current implementation by removing `pyinotify`, redundant 
> python 2 checks and implement a 'file_changed' signal so other parts of 
> Django can react to file changes (i.e the i18n module can handle resetting 
> it's own state).
> 2. Add support for the "watchdog" library as a replacement for pyinotify. 
> Watchdog implements file system notifications for all major platforms using 
> a fairly simple API, so we can remove polling and have instant reloading. 
> Also support Watchman, a notification Daemon from Facebook.
> 3. Add support for more advanced features, like proper handing of startup 
> errors and socket sharing.
>
> I've got a merge request that implements all three stages as a proof of 
> concept, but I think it's far too much a change to be done at once and 
> should be done carefully stage by stage. One and two are fairly simple to 
> implement, but three requires see careful consideration as to the best 
> approach (this message is long enough already, I don't want to describe 
> them here).
>
> Does anyone have any feedback on these ideas? Is it worth persuing even if 
> the current implementation works ok-ish?
>
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-develop...@googlegroups.com .
> To post to this group, send email to django-d...@googlegroups.com 
> .
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/CAFNZOJMT9qDk-4pKKXSJysEQCmd6CGxMZBYZs_7BQs_WbAqL6g%40mail.gmail.com
>  
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to 

Re: Refactoring the autoreloader

2017-10-08 Thread Tim Allen
If we need some dogfooding with real-world candidates, I'd be happy to 
provide some. We have a project which currently takes ~20 minutes to start 
up (automated data models created by introspecting a database) that might 
be a good edge-case candidate.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/33edd13f-7867-4a0c-b834-fbee47e2f5b1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Refactoring the autoreloader

2017-10-07 Thread Aymeric Augustin
Hello Tom,

I think the PR is on the right track. Now it needs to be taken beyond the 
"proof of concept" stage.

Once you have code, docs, and if possibly some tests — currently there are few 
due to the difficulty of testing auto reloading, especially edge cases — for 
steps 1 and 2, I can review changes again and help you get it merged.

Since this is a major overhaul, we'll have to think carefully about how the 
behavior changes and possible backwards incompatibilities. That said, the 
autoreloader is a dev tool. Changing it isn't going to break anyone's 
production. This makes the big-bang approach viable.

Thanks for working on this!

-- 
Aymeric.



> On 29 Sep 2017, at 21:03, Tom Forbes  wrote:
> 
> Hello,
> I've been thinking on and off about how to improve the autoreloader 
> implementation and I wanted to gather some feedback on potential solutions.
> 
> For some background, Django uses a fairly basic autoreload implementation 
> that simply polls the last modified time for loaded Python files once a 
> second. While this isn't the most efficient, it does work and has worked 
> quite well for a long time. When running manage.py runserver, the 
> autoreloader will launch a child "manage.py" with the same arguments and this 
> child process actually runs Django and serves requests. To reload, the child 
> process exits with exit code 3 and the parent restarts it. The code is some 
> of the oldest in Django, with a fair bit of it not touched in 9-12 years.
> 
> While it works (and I'm a believer in "if it isn't broke don't fix it") there 
> are some architectural and performance issues with the current code:
> 
> - Polling every second is not very efficient
> - Detecting when the child process has exited during startup (i.e problem in 
> settings.py) is problematic and the code is rather nasty
> - i18n files are 'reloaded' when they change in a rather hacky way (resetting 
> private attributes in another module)
> - There is limited support for extending the current implementation, and 
> there are cases during development where the parent autoreloader will 
> terminate.
> 
> I don't want this email to be too long, so I'm going to summarize what I 
> think would be a good approach to tackling these problems.
> 
> 1. Refactor the current implementation by removing `pyinotify`, redundant 
> python 2 checks and implement a 'file_changed' signal so other parts of 
> Django can react to file changes (i.e the i18n module can handle resetting 
> it's own state).
> 2. Add support for the "watchdog" library as a replacement for pyinotify. 
> Watchdog implements file system notifications for all major platforms using a 
> fairly simple API, so we can remove polling and have instant reloading. Also 
> support Watchman, a notification Daemon from Facebook.
> 3. Add support for more advanced features, like proper handing of startup 
> errors and socket sharing.
> 
> I've got a merge request that implements all three stages as a proof of 
> concept, but I think it's far too much a change to be done at once and should 
> be done carefully stage by stage. One and two are fairly simple to implement, 
> but three requires see careful consideration as to the best approach (this 
> message is long enough already, I don't want to describe them here).
> 
> Does anyone have any feedback on these ideas? Is it worth persuing even if 
> the current implementation works ok-ish?
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com 
> .
> To post to this group, send email to django-developers@googlegroups.com 
> .
> Visit this group at https://groups.google.com/group/django-developers 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/CAFNZOJMT9qDk-4pKKXSJysEQCmd6CGxMZBYZs_7BQs_WbAqL6g%40mail.gmail.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 

Re: Refactoring the autoreloader

2017-10-07 Thread Tom Forbes
Thanks for the feedback! Watchdog does include a stat watcher, and I think
we can delegate to that. Currently we turn on the file system notifications
if the library is available, but we should add a '--stat' flag to disable
this it you are running on filesystems that don't support this. At which
point we can delegate to the watchdog stat watcher if available, else
fallback to our current implementation.

There hasn't been much feedback on this, so I'm guessing there isn't much
appetite for these changes. I think improving the current code is a good
idea though, even without the filesystem notifications there is some work
that could be done.

On 3 Oct 2017 00:57, "François Freitag"  wrote:

Hi Tom,

Thank you for the great summary.


> 2. Add support for the "watchdog" library as a replacement for
> pyinotify. Watchdog implements file system notifications for all major
> platforms using a fairly simple API, so we can remove polling and have
> instant reloading. Also support Watchman, a notification Daemon from
> Facebook.

Filesystem polling is required for some setup, such as mounting code using
NFS or rsync, for example using vagrant synced folders[1]. Although it does
not prevent from using the watchdog library, which provides a
PollingObserver[2], I think it's worth keeping that use case in mind. The
PR has a StatReloader which seem able to handle filesystem polling, I would
suggest either keeping it or delegating the polling to watchdog.

[1] https://www.vagrantup.com/docs/synced-folders/nfs.html
[2] https://pythonhosted.org/watchdog/api.html#module-watchdog.
observers.polling

Cheers,
François


On 09/29/2017 12:03 PM, Tom Forbes wrote:

> Hello,
> I've been thinking on and off about how to improve the autoreloader
> implementation and I wanted to gather some feedback on potential solutions.
>
> For some background, Django uses a fairly basic autoreload implementation
> that simply polls the last modified time for loaded Python files once a
> second. While this isn't the most efficient, it does work and has worked
> quite well for a long time. When running manage.py runserver, the
> autoreloader will launch a child "manage.py" with the same arguments and
> this child process actually runs Django and serves requests. To reload, the
> child process exits with exit code 3 and the parent restarts it. The code
> is some of the oldest in Django, with a fair bit of it not touched in 9-12
> years.
>
> While it works (and I'm a believer in "if it isn't broke don't fix it")
> there are some architectural and performance issues with the current code:
>
> - Polling every second is not very efficient
> - Detecting when the child process has exited during startup (i.e problem
> in settings.py) is problematic and the code is rather nasty
> - i18n files are 'reloaded' when they change in a rather hacky way
> (resetting private attributes in another module)
> - There is limited support for extending the current implementation, and
> there are cases during development where the parent autoreloader will
> terminate.
>
> I don't want this email to be too long, so I'm going to summarize what I
> think would be a good approach to tackling these problems.
>
> 1. Refactor the current implementation by removing `pyinotify`, redundant
> python 2 checks and implement a 'file_changed' signal so other parts of
> Django can react to file changes (i.e the i18n module can handle resetting
> it's own state).
> 2. Add support for the "watchdog" library as a replacement for pyinotify.
> Watchdog implements file system notifications for all major platforms using
> a fairly simple API, so we can remove polling and have instant reloading.
> Also support Watchman, a notification Daemon from Facebook.
> 3. Add support for more advanced features, like proper handing of startup
> errors and socket sharing.
>
> I've got a merge request that implements all three stages as a proof of
> concept, but I think it's far too much a change to be done at once and
> should be done carefully stage by stage. One and two are fairly simple to
> implement, but three requires see careful consideration as to the best
> approach (this message is long enough already, I don't want to describe
> them here).
>
> Does anyone have any feedback on these ideas? Is it worth persuing even if
> the current implementation works ok-ish?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com  django-developers+unsubscr...@googlegroups.com>.
> To post to this group, send email to django-developers@googlegroups.com
> .
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit https://groups.google.com/d/ms
> 

Re: Refactoring the autoreloader

2017-10-02 Thread François Freitag

Hi Tom,

Thank you for the great summary.

> 2. Add support for the "watchdog" library as a replacement for
> pyinotify. Watchdog implements file system notifications for all major
> platforms using a fairly simple API, so we can remove polling and have
> instant reloading. Also support Watchman, a notification Daemon from
> Facebook.

Filesystem polling is required for some setup, such as mounting code 
using NFS or rsync, for example using vagrant synced folders[1]. 
Although it does not prevent from using the watchdog library, which 
provides a PollingObserver[2], I think it's worth keeping that use case 
in mind. The PR has a StatReloader which seem able to handle filesystem 
polling, I would suggest either keeping it or delegating the polling to 
watchdog.


[1] https://www.vagrantup.com/docs/synced-folders/nfs.html
[2] 
https://pythonhosted.org/watchdog/api.html#module-watchdog.observers.polling


Cheers,
François

On 09/29/2017 12:03 PM, Tom Forbes wrote:

Hello,
I've been thinking on and off about how to improve the autoreloader 
implementation and I wanted to gather some feedback on potential solutions.


For some background, Django uses a fairly basic autoreload 
implementation that simply polls the last modified time for loaded 
Python files once a second. While this isn't the most efficient, it does 
work and has worked quite well for a long time. When running manage.py 
runserver, the autoreloader will launch a child "manage.py" with the 
same arguments and this child process actually runs Django and serves 
requests. To reload, the child process exits with exit code 3 and the 
parent restarts it. The code is some of the oldest in Django, with a 
fair bit of it not touched in 9-12 years.


While it works (and I'm a believer in "if it isn't broke don't fix it") 
there are some architectural and performance issues with the current code:


- Polling every second is not very efficient
- Detecting when the child process has exited during startup (i.e 
problem in settings.py) is problematic and the code is rather nasty
- i18n files are 'reloaded' when they change in a rather hacky way 
(resetting private attributes in another module)
- There is limited support for extending the current implementation, and 
there are cases during development where the parent autoreloader will 
terminate.


I don't want this email to be too long, so I'm going to summarize what I 
think would be a good approach to tackling these problems.


1. Refactor the current implementation by removing `pyinotify`, 
redundant python 2 checks and implement a 'file_changed' signal so other 
parts of Django can react to file changes (i.e the i18n module can 
handle resetting it's own state).
2. Add support for the "watchdog" library as a replacement for 
pyinotify. Watchdog implements file system notifications for all major 
platforms using a fairly simple API, so we can remove polling and have 
instant reloading. Also support Watchman, a notification Daemon from 
Facebook.
3. Add support for more advanced features, like proper handing of 
startup errors and socket sharing.


I've got a merge request that implements all three stages as a proof of 
concept, but I think it's far too much a change to be done at once and 
should be done carefully stage by stage. One and two are fairly simple 
to implement, but three requires see careful consideration as to the 
best approach (this message is long enough already, I don't want to 
describe them here).


Does anyone have any feedback on these ideas? Is it worth persuing even 
if the current implementation works ok-ish?


--
You received this message because you are subscribed to the Google 
Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send 
an email to django-developers+unsubscr...@googlegroups.com 
.
To post to this group, send email to django-developers@googlegroups.com 
.

Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAFNZOJMT9qDk-4pKKXSJysEQCmd6CGxMZBYZs_7BQs_WbAqL6g%40mail.gmail.com 
.

For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google Groups "Django 
developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 

Refactoring the autoreloader

2017-09-29 Thread Tom Forbes
Hello,
I've been thinking on and off about how to improve the autoreloader
implementation and I wanted to gather some feedback on potential solutions.

For some background, Django uses a fairly basic autoreload implementation
that simply polls the last modified time for loaded Python files once a
second. While this isn't the most efficient, it does work and has worked
quite well for a long time. When running manage.py runserver, the
autoreloader will launch a child "manage.py" with the same arguments and
this child process actually runs Django and serves requests. To reload, the
child process exits with exit code 3 and the parent restarts it. The code
is some of the oldest in Django, with a fair bit of it not touched in 9-12
years.

While it works (and I'm a believer in "if it isn't broke don't fix it")
there are some architectural and performance issues with the current code:

- Polling every second is not very efficient
- Detecting when the child process has exited during startup (i.e problem
in settings.py) is problematic and the code is rather nasty
- i18n files are 'reloaded' when they change in a rather hacky way
(resetting private attributes in another module)
- There is limited support for extending the current implementation, and
there are cases during development where the parent autoreloader will
terminate.

I don't want this email to be too long, so I'm going to summarize what I
think would be a good approach to tackling these problems.

1. Refactor the current implementation by removing `pyinotify`, redundant
python 2 checks and implement a 'file_changed' signal so other parts of
Django can react to file changes (i.e the i18n module can handle resetting
it's own state).
2. Add support for the "watchdog" library as a replacement for pyinotify.
Watchdog implements file system notifications for all major platforms using
a fairly simple API, so we can remove polling and have instant reloading.
Also support Watchman, a notification Daemon from Facebook.
3. Add support for more advanced features, like proper handing of startup
errors and socket sharing.

I've got a merge request that implements all three stages as a proof of
concept, but I think it's far too much a change to be done at once and
should be done carefully stage by stage. One and two are fairly simple to
implement, but three requires see careful consideration as to the best
approach (this message is long enough already, I don't want to describe
them here).

Does anyone have any feedback on these ideas? Is it worth persuing even if
the current implementation works ok-ish?

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAFNZOJMT9qDk-4pKKXSJysEQCmd6CGxMZBYZs_7BQs_WbAqL6g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.