Re: [Zope3-dev] Re: Google SoC Project

2006-05-12 Thread Jim Fulton

Tarek Ziadé wrote:

Jim Fulton wrote:



- Look at opportunities for limited robust reload.  Perhaps we could
 define reloadable modules, especially for defining adapters,
 with restrictions on their definitions and exports in a way
 that allows robust reload.  This would probably be based on the
 persistent-module experiments.  This is a fair bit of deep work
 though and I'm not sure who has the interest and ability to make
 it happen.

I'm really not interested in a reload facility, like the one commonly
used in Zope 2, that is not robust.  I've wasted too many hours
helping people debug problems that were caused by reload mishaps.



Out of curiosity, what are the things that make a reload not robust?
Is it just a matter of dependencies, or is it deeper?


I was hoping that someone else would answer this directly. :)
Shane did largely answer it, but I'll try to be more direct and
concise:

When you reload a module, the module source is recompiled and
executed.  Values defined in the new version overwrite values of
the same name in the old version.  This has lots of implications:

- Client modules that imported names using from:

      from oldmodule import somename

  don't see the update.

- Instances of classes defined in the module remain instances
  of the old classes.

- Writable global data is overwritten.  This is a common source
  of subtle bugs when a module defines a registry or cache.

There are probably other interesting things I'm not thinking of.

There are ways of working around these issues, but they require
special techniques that aren't always followed or can be defeated.
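
The three implications listed above can be reproduced in a few lines. The following is a minimal sketch, not code from the thread: it writes a throwaway module to a temporary directory (the contents of `oldmodule` and the helper name `demonstrate_reload_pitfalls` are made up for the demonstration) and uses the modern `importlib.reload`, which behaves like the `reload` builtin discussed here.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical module contents, invented for this demonstration.
V1 = "somename = 'old'\nregistry = {}\nclass Widget:\n    pass\n"
V2 = "somename = 'new value'\nregistry = {}\nclass Widget:\n    pass\n"


def demonstrate_reload_pitfalls():
    """Reload a throwaway module and report which pitfalls were observed."""
    results = {}
    saved_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # no cached bytecode, so reload recompiles
    with tempfile.TemporaryDirectory() as tmp:
        source = pathlib.Path(tmp) / "oldmodule.py"
        source.write_text(V1)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            import oldmodule
            from oldmodule import somename   # copies the binding into this scope
            widget = oldmodule.Widget()      # instance of the class as first loaded
            oldmodule.registry["key"] = 1    # state accumulated at run time

            source.write_text(V2)
            importlib.reload(oldmodule)

            # 1. The name imported with "from" still has the old value.
            results["from_import_is_stale"] = (
                somename == "old" and oldmodule.somename == "new value")
            # 2. The existing instance belongs to the old class object.
            results["instance_is_orphaned"] = not isinstance(
                widget, oldmodule.Widget)
            # 3. The module-level registry was replaced by a fresh, empty dict.
            results["registry_was_wiped"] = oldmodule.registry == {}
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("oldmodule", None)
            sys.dont_write_bytecode = saved_flag
    return results
```

Calling `demonstrate_reload_pitfalls()` should report all three effects as observed. None of them raises an error, which is why such problems tend to surface later and far from the reload itself.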

Jim

--
Jim Fulton   mailto:[EMAIL PROTECTED]   Python Powered!
CTO  (540) 361-1714   http://www.python.org
Zope Corporation http://www.zope.com   http://www.zope.org
___
Zope3-dev mailing list
Zope3-dev@zope.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com



Re: [Zope3-dev] Re: Google SoC Project

2006-05-12 Thread Jim Fulton

Lennart Regebro wrote:

On 5/9/06, Jim Fulton [EMAIL PROTECTED] wrote:


- Speed up restart.  I think there are a lot of ways that restarts
   can be made faster:


[...]


   o Load less.  A Zope 3 application that only loads what it actually
 uses will load much more quickly than a full Zope 3 checkout.



Just a brainstormy idea:

One thing I like with Python imports which ZCML doesn't do, is that it
only loads things that really are imported.


I don't understand this.  ZCML doesn't get magically loaded unless it
is explicitly included.  ZCML directives only load modules they refer
to.


Maybe there could be a way
to say which products you depend on in ZCML, and only load the ZCML of
these? Kinda like a zcml-import, but not creating problems if you
import it twice?


There is. It's called include.

Jim




Re: [Zope3-dev] Re: Google SoC Project

2006-05-12 Thread Jim Fulton

Stephan Richter wrote:

On Tuesday 09 May 2006 07:22, Jim Fulton wrote:


I guess we need to make this a priority for the next release.

Python simply does not support a general robust reload, other than
restart.

I think that there are 2 ways we can make progress in this area:

- Speed up restart.  I think there are a lot of ways that restarts
  can be made faster:

  o Optimize what we're doing now.  I suspect that there are some
    opportunities here.



I have applied for the SoC with a proposal to enhance ZCML. My proposal is 
attached. It discusses some of the optimization options we talked of before. 


...


  I propose to allow local components to be configurable through ZCML. This
  goal became feasible with the recent component architecture refactorings by
  Jim Fulton. Any site (local or global) can now have a set of base sites that
  are used to provide additional components. I want to allow ZCML to specify
  any number of base sites and add components to it. Here is an example of how
  I imagine it would look (ZCML)::

    <site name="my-base-site" />


I'll note that I'd like site-name to be a convenience directive that is
equivalent to defining a GlobalComponents object and registering it as
an IComponents utility.  Note that GlobalComponents doesn't exist yet. :)
It should be very similar to BaseGlobalComponents, except that it pickles
and unpickles as a named utility registered with a BaseGlobalComponents.



    <configure use-site="my-base-site">
      ...
    </configure>

  The new ``site`` directive creates a new site. The ``configure`` directive
  will grow an ``use-site`` attribute that specifies the site to put in the
  components. By default,


Actually, the site to put the components in. :)


  There will be a new registry of all ZCML-defined sites.


No!  Use the base utility registry.

   All existing ZCML

  directives have to be reviewed and it must be ensured that they are
  multi-site aware. The tricky part of the implementation will be to hook in
  those sites as bases to local sites. It must be ensured that the ZODB can
  load having filesystem-based sub-sites, error handling must be carefully
  considered and a UI must be written.


The pickling aspects are pretty trivial.  I'm not sure what UI you are referring
to.

...


  Another big problem in Zope 3 is the startup time. Some code profiling has
  determined that most of the time in the startup process is lost in parsing,
  converting and validating ZCML directives with their schemas. Thus, this
  startup problem is not purely a Zope 3 problem, but one that affects
  everyone using ZCML.

  This problem can be addressed in several ways. The most obvious one to Zope
  2 developers would be not to restart the application server, but only reload
  the packages and modules that were affected by the code changes. This
  approach has been used in Zope 2 for many years, but it has several serious
  problems and some of the smartest people I know have not been able to
  completely solve the problems. Based on that, I do not think that a proposal
  suggesting this approach would be accepted.

  The second approach is to reduce the ZCML processing time, which could be
  integrated into the reload mechanism for Zope 2. This can be accomplished by
  storing some binary representation of the ZCML, similarly to ``*.pyc`` files
  in Python. Again there are several choices to consider and they should
  probably all be tried. The first solution would be to store a pickle of each
  parsed directive, namely the action and its arguments. There would be one
  pickle file for each ZCML file. When the ZCML file changed, the pickle
  would be updated. Pickle loading would be much faster than pure ZCML
  loading, since no XML-parsing, value conversion and schema validation would
  be necessary.


Note that this will require a refactoring of ZCML handlers to define picklable
actions.  This will also require refactoring so that work now done by handlers
be deferred to action execution.


  On the other hand, ZCML creates actions that are eventually
  executed. Actions are created by executing the directive handlers. Thus the
  optimization in this approach would be even greater. The problem with this
  approach is that not all directives are easily picklable. Directive handlers
  often create new types/classes on the fly. This problem could be solved by
  ensuring that directives only create picklable actions. Clearly, this would
  require a lot more work, since you would have to go through all directives
  to ensure their picklability and also provide fallbacks for 3rd-party
  directives.


Yup.  One possibility is to have a mechanism in which the pickle files
are created only when possible.  That is, we try to pickle the actions
and give up if pickling fails. If we update the core directives to
work this way, that will account for most configuration files and
could provide a significant speedup even if not all configuration files
are handled.  Note 

Re: [Zope3-dev] Re: Google SoC Project

2006-05-12 Thread Tarek Ziadé
Jim Fulton wrote:

 Out of curiosity, what are the things that make a reload not robust?
 Is it just a matter of dependencies, or is it deeper?


 I was hoping that someone else would answer this directly. :)
 Shane did largely answer it, but I'll try to be more direct and
 concise:

 When you reload a module, the module source is recompiled and
 executed.  Values defined in the new version overwrite values of
 the same name in the old version.  This has lots of implications:

 - Client modules that imported names using from:

       from oldmodule import somename

   don't see the update.

 - Instances of classes defined in the module remain instances
   of the old classes.

 - Writable global data is overwritten.  This is a common source
   of subtle bugs when a module defines a registry or cache.

 There are probably other interesting things I'm not thinking of.

 There are ways of working around these issues, but they require
 special techniques that aren't always followed or can be defeated.

 Jim

Ok, thanks for the explanation

Tarek




Re: [Zope3-dev] Re: Google SoC Project

2006-05-12 Thread Jim Fulton

Adam Groszer wrote:

Hello Jim,

Tuesday, May 9, 2006, 1:22:30 PM, you wrote:

[snip]
JF Python simply does not support a general robust reload, other than
JF restart.
[snip]

What about pushing the problem down to the lower level, to Python
itself? I think all developers are fighting the same problem, so all
Python developers would benefit from the solution. As far as I know (I
may be wrong), few if any languages support that, so it would be
one big plus point on the Python side as well.

As I don't have really deep knowledge of the Python interpreter
itself, I cannot judge how weird the idea is. Maybe we should ask
Guido to have some thoughts about that.


Go for it.  Perhaps this is a good time, in light of Python 3000.

In the past, Guido has said he simply didn't define Python modules
to support reload.

Jim




Re: reloading modules (was Re: [Zope3-dev] Re: Google SoC Project)

2006-05-12 Thread Jim Fulton

Shane Hathaway wrote:
...
2) Make reloadable code fundamentally different. 


Yes.

 If module X is
supposed to be reloadable, and X creates a module-level global variable 
Y, and module Z imports Y, then Y needs to be decorated in such a way 
that Z's view of Y can change automatically when X is reloaded.


A variation on this theme is to cause reload to update anything that
can be exported (rather than replacing it).  Of course, this means that
you couldn't export immutable objects, or reload shouldn't be allowed
to provide a new value for an immutable variable.

Our work on persistent modules should shed some light on this.

I also think there is a real opportunity in allowing reload to fail.
That is, it should be possible for reload to visibly fail so the user
knows that they have to restart.  Then we only reload when we *know*
we can make changes safely and fail otherwise.  For example, in the common
case of updating a class, we can update the class in place.  If there
aren't any other changes, then we know the reload is safe.

This second approach has subtle limitations, though.  What if Y has the 
value 10 and Z defines a global variable A whose value is (Y**2)?  The 
value of A might need to change when Y changes, but how can we arrange 
for that to happen without making a mess of the code?  I doubt there's 
any reasonable general solution.


Sure, but realize that this isn't unique to reload.  A client needs to
know if something is idempotent or not.  It should not cache the result of
a non-idempotent operation.

Even more subtle is what happens when a reloadable module holds a 
registry of things imported from other modules.  When the module is 
reloaded, should the registry get cleared?  Zope 2's refresh says the 
registry should be cleared, but in practice, this confuses everyone.


It causes pain and suffering.  :)

 To solve this, I think reloadable modules need to have a special global
 namespace.  Everything in the global namespace, as well as everything
 reachable from the global namespace, must be explicit about what happens
 at the time another module imports it or the module is reloaded.  I
 think this could make a refresh mechanism like the one in Zope 2
 reliable.  It has a lot of similarity with persistent modules, but it
 might be simpler.  I haven't thought it all the way through.  The idea
 came to me about halfway through this post. :-)

Here's an idea: When we do a new-improved reload, we:

1. Reevaluate the code in the pyc, getting the original dictionary.

2. Recompile and evaluate the code without writing a new pyc.

3. Compare the old and new dictionaries to find changes.  If we
   don't know how to compare 2 items, we assume a change.  Note
   removing a variable is considered an unsafe change.  Adding a
   variable is assumed to be a safe change as long as a variable of
   the same name hasn't been added to the module dynamically.

4. We consider whether each of the changes is safe.  If any change
   is unsafe, we raise an error, aborting the reload.  A change is
   safe if the original and new variables are of the same type, the
   values are mutable and if we know how to update the old value
   based on the new value.  In addition, for a change to be safe,
   the original value and the value currently in the module must be
   the same and have the same value.  That is, it can't have been
   changed dynamically.

5. We apply the changes and write a new pyc.

This boils down to merging differences at the Python level.
We fail if we don't know how to apply the diff. At that point,
the user knows they need to restart to get the change.
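
The five steps above can be sketched in Python. This is only an illustration under several assumptions that are not in the proposal: the old and new code are passed in as source strings rather than read from a pyc, no pyc is written at the end, and only plain functions, simple classes, and dict/list/set values are treated as updatable. All names (`safe_reload`, `UnsafeReload`, and the helpers) are hypothetical.

```python
import types

SIMPLE = (int, float, str, bytes, tuple, frozenset, type(None))
CONTAINERS = (dict, list, set)

class UnsafeReload(Exception):
    """The change cannot be merged in place; a restart is required."""

def evaluate(source, name):
    """Run module source in a scratch namespace; return the names it defines."""
    namespace = {"__name__": name}
    exec(compile(source, name, "exec"), namespace)
    return {k: v for k, v in namespace.items() if not k.startswith("__")}

def members(cls):
    """Attributes defined by a class body, minus interpreter bookkeeping."""
    return {k: v for k, v in vars(cls).items()
            if isinstance(v, types.FunctionType) or not k.startswith("__")}

def fingerprint(value):
    """Comparable summary of a value; TypeError means 'cannot compare'."""
    if isinstance(value, types.FunctionType) and not value.__closure__:
        return ("function", value.__code__, value.__defaults__, value.__kwdefaults__)
    if isinstance(value, type):
        return ("class", {k: fingerprint(v) for k, v in members(value).items()})
    if isinstance(value, SIMPLE + CONTAINERS):
        return (type(value), value)
    raise TypeError(f"cannot compare {type(value).__name__}")

def changed(a, b):
    if a is b:
        return False  # e.g. the same imported object in both namespaces
    try:
        return fingerprint(a) != fingerprint(b)
    except TypeError:
        return True  # step 3: if we cannot compare, assume a change

def can_merge(old, new):
    """True if the new value is comparable, mutable, and of the old type."""
    try:
        fingerprint(new)
    except TypeError:
        return False
    return isinstance(new, (types.FunctionType, type) + CONTAINERS) and type(new) is type(old)

def adopt(value, module):
    """Rebind freshly compiled functions to the live module's globals."""
    if isinstance(value, types.FunctionType):
        clone = types.FunctionType(value.__code__, vars(module), value.__name__,
                                   value.__defaults__, value.__closure__)
        clone.__kwdefaults__ = value.__kwdefaults__
        return clone
    if isinstance(value, type):
        for key, attr in members(value).items():
            setattr(value, key, adopt(attr, module))
    return value

def update_in_place(old, new, module):
    """Make the existing object behave like the new one, keeping its identity."""
    if isinstance(old, types.FunctionType):
        old.__code__ = new.__code__
        old.__defaults__, old.__kwdefaults__ = new.__defaults__, new.__kwdefaults__
    elif isinstance(old, type):
        for key in members(old).keys() - members(new).keys():
            delattr(old, key)
        for key, attr in members(new).items():
            setattr(old, key, adopt(attr, module))
    elif isinstance(old, list):
        old[:] = new
    else:  # dict or set
        old.clear()
        old.update(new)

def safe_reload(module, old_source, new_source):
    """Merge new_source into the live module, or raise UnsafeReload untouched."""
    original = evaluate(old_source, module.__name__)  # step 1
    updated = evaluate(new_source, module.__name__)   # step 2
    live = vars(module)
    removed = original.keys() - updated.keys()
    if removed:
        raise UnsafeReload(f"removed: {sorted(removed)}")
    for name, new in updated.items():                 # steps 3 and 4: check only
        if name not in original:
            if name in live:
                raise UnsafeReload(f"{name} was already added dynamically")
        elif changed(original[name], new):
            if name not in live or changed(original[name], live[name]):
                raise UnsafeReload(f"{name} was changed dynamically")
            if not can_merge(original[name], new):
                raise UnsafeReload(f"do not know how to update {name}")
    for name, new in updated.items():                 # step 5: apply
        if name not in original:
            live[name] = adopt(new, module)
        elif changed(original[name], new):
            update_in_place(live[name], new, module)
```

Because all checks run before anything is applied, a failed reload leaves the module as it was. The sketch is deliberately conservative: methods that use zero-argument `super()` carry a closure, so classes containing them are treated as not comparable and block the reload, and changes to base classes are not detected at all.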

Hm. This feels kind of workable.  It might even make a good PEP
for a safe reload.

What do you think?

Jim




Re: [Zope3-dev] Re: Google SoC Project

2006-05-12 Thread Jim Fulton

Dieter Maurer wrote:

Jim Fulton wrote at 2006-5-9 07:22 -0400:


...
   Finally, there's a lot of interest in generating configuration
   actions in Python, rather than ZCML.  I suspect that avoiding
   XML processing, conversion, and validation might speed startup
   quite a bit.



Moreover, if the component performs its own reregistration
on reload, the Z2 refresh may be possible as well.


This sounds too complex to me.

See my response to Shane's post though.  I think a limited
reload with explicit failures could be very useful and safe.

...


I hear of very few problems here.


Good for you.  I've observed that when reload causes problems,
people are often unaware of it.  People have baffling errors
that eventually clear themselves up (after a restart), increasing
their awe of the fickleness of the Zope gods. ;)

Unfortunately, in the past people have often asked me for help
in debugging strange failures, wasting hours of my time.

Jim




Re: [Zope3-dev] Re: Google SoC Project

2006-05-12 Thread Stephan Richter
Hi everyone,

I just discussed those comments with Jim via IRC. The following comments are 
FYI.

On Friday 12 May 2006 08:56, Jim Fulton wrote:
directives have to be reviewed and it must be ensured that they are
multi-site aware. The tricky part of the implementation will be to hook
  in those sites as bases to local sites. It must be ensured that the ZODB
  can load having filesystem-based sub-sites, error handling must be
  carefully considered and a UI must be written.

 The pickling aspects are pretty trivial.  I'm not sure what UI you are
 referring to.

I am referring to the UI that lets you select the IComponents utilities that 
will act as bases for the local site. After the discussion it became clear 
that once the pickling is done correctly, it is no problem.

The second approach is to reduce the ZCML processing time, which could
  be integrated into the reload mechanism for Zope 2. This can be
  accomplished by storing some binary representation of the ZCML, similarly
  to ``*.pyc`` files in Python. Again there are several choices to consider
  and they should probably all be tried. The first solution would be to
  store a pickle of each parsed directive, namely the action and its
  arguments. There would be one pickle file for each ZCML file. When the
  ZCML file changed, the pickle would be updated. Pickle loading would be
  much faster than pure ZCML loading, since no XML-parsing, value
  conversion and schema validation would be necessary.

 Note that this will require a refactoring of ZCML handlers to define
 picklable actions.  This will also require refactoring so that work now
 done by handlers be deferred to action execution.

As I explained to Jim on IRC, I am not proposing pickling the configuration 
actions, but the configuration handler callable and its arguments. For 
functions, this is trivial to do. For complex directives that use classes 
this is a little bit harder, but not much.

We will still have the benefit of saving value conversion and validation, as 
well as XML parsing (though I am not sure whether pickle parsing is faster). 
The approach is also much safer, since it does not depend on the subtleties 
of directives, which is good. Not only are actions often unpicklable, but some 
directives also do not generate actions, but do their work directly; this is 
due to some bootstrap issues. An approach pickling actions would miss those 
registrations. The more I think about this, the more I believe this is the 
right approach.
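
A rough sketch of what such a per-file cache could look like, combining the idea of pickling handler calls with Jim's suggestion to give up when pickling fails. Everything here is hypothetical: `load_calls`, the `parse` callable standing in for the slow ZCML path, the `.pickle` file suffix, and the `(handler, args, kwargs)` tuple layout are assumptions for illustration, not zope.configuration APIs.

```python
import os
import pickle


def load_calls(config_path, parse):
    """Return the list of (handler, args, kwargs) calls for a config file.

    `parse` is a hypothetical callable that does the slow work (XML
    parsing, value conversion, schema validation) and returns that list.
    """
    cache_path = config_path + ".pickle"  # hypothetical naming scheme
    info = os.stat(config_path)
    stamp = (info.st_mtime_ns, info.st_size)

    # Fast path: reuse the cache if it was built from this exact file version.
    try:
        with open(cache_path, "rb") as cache_file:
            cached_stamp, calls = pickle.load(cache_file)
        if cached_stamp == stamp:
            return calls
    except Exception:
        pass  # missing, corrupt, or unloadable cache: fall back to parsing

    calls = parse(config_path)

    # Try to pickle; give up quietly if a handler or argument refuses.
    try:
        payload = pickle.dumps((stamp, calls))
    except (pickle.PicklingError, TypeError, AttributeError):
        return calls
    try:
        with open(cache_path, "wb") as cache_file:
            cache_file.write(payload)
    except OSError:
        pass  # e.g. read-only directory; the cache is only an optimization
    return calls
```

Module-level functions pickle by reference (module name plus function name), which is why plain function handlers are the easy case; handlers created on the fly fall into the "give up" branch and are simply parsed again on the next start. Since unpickling can execute arbitrary code, such cache files would need the same trust as the code directory itself.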

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training



Re: reloading modules (was Re: [Zope3-dev] Re: Google SoC Project)

2006-05-12 Thread Shane Hathaway

Jim Fulton wrote:

I also think there is a real opportunity in allowing reload to fail.
That is, it should be possible for reload to visibly fail so the user
knows that they have to restart.  Then we only reload when we *know*
we can make changes safely and fail otherwise.  For example, in the common
case of updating a class, we can update the class in place.  If there
aren't any other changes, then we know the reload is safe.


That's insightful.  Zope 2's refresh really should refuse to reload 
sometimes.  Right now it just trusts whoever wrote the refresh.txt file.



Here's an idea: When we do a new-improved reload, we:

1. Reevaluate the code in the pyc, getting the original dictionary.

2. Recompile and evaluate the code without writing a new pyc.


Reloadable modules better not cause side effects at import time!


3. Compare the old and new dictionaries to find changes.  If we
   don't know how to compare 2 items, we assume a change.  Note
   removing a variable is considered an unsafe change.  Adding a
   variable is assumed to be a safe change as long as a variable of
   the same name hasn't been added to the module dynamically.

4. We consider whether each of the changes is safe.  If any change
   is unsafe, we raise an error, aborting the reload.  A change is
   safe if the original and new variables are of the same type, the
   values are mutable and if we know how to update the old value
   based on the new value.  In addition, for a change to be safe,
   the original value and the value currently in the module must be
   the same and have the same value.  That is, it can't have been
   changed dynamically.


It sounds like populating of any sort of registry in a module would 
prevent the module from being reloaded.  Take this for example:


# module mimestuff.py
content_types = {}

def add_content_type(name, extension):
    content_types[name] = extension

As soon as anyone calls add_content_type(), including the module itself, 
the state of the content_types dict changes from the original value. 
That's fine by me if that's what you intended.  Reloading modules 
containing registries never seemed like a good idea to me anyway.
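
The drift described here can be detected mechanically by re-running the module source in a scratch namespace and comparing the result with the live one. A minimal sketch, assuming the source text is available as a string; the helper name `drifted_names` is made up, and only non-callable values are compared.

```python
MIMESTUFF_SOURCE = """
content_types = {}

def add_content_type(name, extension):
    content_types[name] = extension
"""


def drifted_names(source, live_namespace):
    """Names whose live data no longer equals what the source defines."""
    pristine = {}
    exec(source, pristine)  # what the module looks like right after import
    return sorted(
        name for name, value in pristine.items()
        if not name.startswith("__")
        and not callable(value)
        and live_namespace.get(name) != value
    )


# Simulate the live module, then populate its registry.
live = {}
exec(MIMESTUFF_SOURCE, live)
before = drifted_names(MIMESTUFF_SOURCE, live)
live["add_content_type"]("text/html", ".html")
after = drifted_names(MIMESTUFF_SOURCE, live)
```

Here `before` is an empty list and `after` names `content_types`, which is the signal a cautious reload could use to refuse.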



5. We apply the changes and write a new pyc.


The server might not have write access to its code directory.  Maybe we 
can't reload if the server can't write the .pyc, since writing the .pyc 
is required to perform further reloads.



This boils down to merging differences at the Python level.
We fail if we don't know how to apply the diff. At that point,
the user knows they need to restart to get the change.

Hm. This feels kind of workable.  It might even make a good PEP
for a safe reload.


It's certainly an improvement.  It's still possible for other modules to 
retain state based on a reloadable module's old state.  Should we worry 
about that?  Is it something that programmers understand intuitively 
enough that when they run into it, they won't be baffled?


Shane



Re: reloading modules (was Re: [Zope3-dev] Re: Google SoC Project)

2006-05-10 Thread Adam Groszer
Hi Shane,

Please have a look at http://www.pythomnic.org/.
As I get it, it puts proxies around 'imported' modules.

My idea, without having thought it through any further, would be to
put proxies in front of everything that is imported: modules, callables,
variables, everything, and to evaluate the referenced thing at the right
time, against the right source code. Something like the way the Zope
security proxy works.
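
As a rough illustration of the proxy idea (not a description of how Pythomnic or the Zope security proxy actually work), a stand-in object can defer every attribute lookup until the moment of use. The class name `ModuleProxy` is hypothetical, and the sketch assumes a reload mechanism that replaces the module's entry in `sys.modules`.

```python
import importlib
import sys


class ModuleProxy:
    """Forward attribute access to whatever module is current right now."""

    def __init__(self, name):
        self._name = name

    def __getattr__(self, attribute):
        # Only called for names not found on the proxy itself, so the real
        # lookup happens at use time rather than at import time.
        module = sys.modules.get(self._name)
        if module is None:
            module = importlib.import_module(self._name)
        return getattr(module, attribute)
```

A client would write `settings = ModuleProxy("myapp.settings")` instead of `import myapp.settings as settings` (the module name is only an example). This covers module-level access; names copied out with `from ... import ...` would need proxies of their own, which is the harder part of the idea.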

[snip]

-- 
Best regards,
 Adam  mailto:[EMAIL PROTECTED]
--
Quote of the day:
Nothing is so good as it seems beforehand. 
- George Eliot 




Re: [Zope3-dev] Re: Google SoC Project

2006-05-09 Thread Jim Fulton

whit wrote:

Adam Groszer wrote:


I personally am tired of restarting z3 each time I make an error, even
if it is just a one-character typo. I'm now doing a wx-based app, and the
problem is the same... make an error, restart, click 10 times...
It would be also a way to have a developer version which might run
slower.



amen...   In the plone community, we have several influential developers 
who don't use z3 tech I suspect because developing with pythonscript is 
*still faster* than writing views and adapters because one doesn't have 
to reload to see minor code changes.


also, in z2 land, refreshing a product loses all the related z3 
registrations.


being able to dynamically reload without restart would be a huge 
fricking win.


I guess we need to make this a priority for the next release.

Python simply does not support a general robust reload, other than
restart.

I think that there are 2 ways we can make progress in this area:

- Speed up restart.  I think there are a lot of ways that restarts
  can be made faster:

  o Optimize what we're doing now.  I suspect that there are some
    opportunities here.

  o Load less.  A Zope 3 application that only loads what it actually
uses will load much more quickly than a full Zope 3 checkout.

The Zope 3 checkout has as much as it does to provide a
way to test a range of applications when we modify Zope 3.
We need to have a better way of solving this problem without
such a bloated checkout configuration.

Also, we need to make progress with packaging, to make
it easier for people to get just the components they need.
I wanted to switch to eggs for the 3.3 release, but, sadly,
there wasn't enough time.  I think switching to package-based
distributions and installation should be a top priority for
3.4.

Finally, there's a lot of interest in generating configuration
actions in Python, rather than ZCML.  I suspect that avoiding
XML processing, conversion, and validation might speed startup
quite a bit.

- Look at opportunities for limited robust reload.  Perhaps we could
  define reloadable modules, especially for defining adapters,
  with restrictions on their definitions and exports in a way
  that allows robust reload.  This would probably be based on the
  persistent-module experiments.  This is a fair bit of deep work
  though and I'm not sure who has the interest and ability to make
  it happen.

I'm really not interested in a reload facility, like the one commonly
used in Zope 2, that is not robust.  I've wasted too many hours
helping people debug problems that were caused by reload mishaps.

Jim




Re: [Zope3-dev] Re: Google SoC Project

2006-05-09 Thread Tarek Ziadé
Jim Fulton wrote:


 - Look at opportunities for limited robust reload.  Perhaps we could
   define reloadable modules, especially for defining adapters,
   with restrictions on their definitions and exports in a way
   that allows robust reload.  This would probably be based on the
   persistent-module experiments.  This is a fair bit of deep work
   though and I'm not sure who has the interest and ability to make
   it happen.

 I'm really not interested in a reload facility, like the one commonly
 used in Zope 2, that is not robust.  I've wasted too many hours
 helping people debug problems that were caused by reload mishaps.

Out of curiosity, what are the things that make a reload not robust?
Is it just a matter of dependencies, or is it deeper?

Tarek



Re: [Zope3-dev] Re: Google SoC Project

2006-05-09 Thread Lennart Regebro

On 5/9/06, Jim Fulton [EMAIL PROTECTED] wrote:

- Speed up restart.  I think there are a lot of ways that restarts
   can be made faster:

[...]

   o Load less.  A Zope 3 application that only loads what it actually
 uses will load much more quickly than a full Zope 3 checkout.


Just a brainstormy idea:

One thing I like with Python imports which ZCML doesn't do, is that it
only loads things that really are imported. Maybe there could be a way
to say which products you depend on in ZCML, and only load the ZCML of
these? Kinda like a zcml-import, but not creating problems if you
import it twice?

--
Lennart Regebro, Nuxeo http://www.nuxeo.com/
CPS Content Management http://www.cps-project.org/



Re: [Zope3-dev] Re: Google SoC Project

2006-05-09 Thread Stephan Richter
On Tuesday 09 May 2006 07:22, Jim Fulton wrote:
 I guess we need to make this a priority for the next release.

 Python simply does not support a general robust reload, other than
 restart.

 I think that there are 2 ways we can make progress in this area:

 - Speed up restart.  I think there are a lot of ways that restarts
    can be made faster:

    o Optimize what we're doing now.  I suspect that there are some
      opportunities here.

I have applied for the SoC with a proposal to enhance ZCML. My proposal is 
attached. It discusses some of the optimization options we talked of before. 
If accepted I would work on this and the result would be, naturally, in the 
next Zope 3 and 2 release.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training
==============
Enhancing ZCML
==============

Name: Stephan Richter
E-Mail: [EMAIL PROTECTED]
IRC Nickname: srichter

How much time do you expect to have for this project? Please list jobs,
summer classes, and/or vacations that you'll need to work around:

  I am a Ph.D. student and do not take any classes anymore. I am going to teach
  the first summer session Physics 11 (Calculus-based mechanics) and continue
  my thesis. Overall I think 3-4 weeks of time during the summer are a
  realistic estimate.

Development experience:

  - 9 years of Python experience.
  - 6 years of Zope experience.
  - Zope 3 core developer and release manager.

Please describe your usage experience/familiarity with the project you are
applying for:

  I am a Zope 3 core developer. I am planning to implement new features in
  Zope 3 that will make it easier to develop highly-customizable content
  management systems, like Plone, using Zope 3.

What school do you attend? How many years have you attended there? What is
your specialty/major at the school?

  Tufts University, Somerville, MA. Ph.D. in Physics, started in 2002. I am
  developing models to simulate the immune system response upon SIV/HIV
  infections.

Project Details:

  Project Outline
  ---------------

  Zope 3 uses a component architecture as one of its most basic building
  blocks. Since Zope 3 is a Web Application server, it must be possible to run
  multiple Web sites on one application instance. The consequence is that Zope
  must be configurable on a site by site basis. Thus an extension of the
  component architecture allows us to define components globally (for all
  sites) and locally (site specific).

  Global components are commonly configured using an XML-based configuration
  language called ZCML. Currently, local components can only be created and
  configured via the Web UI. The advantage of doing so is that local
  components can store their state in the ZODB. But this also means that it is
  very difficult and cumbersome to register local components using regular
  filesystem-based code. Clearly, this functionality is
  sub-optimal. Oftentimes you want to be able to define site-specific (local)
  components from the filesystem. This is particularly true for presentation
  code, where it is often not a requirement that the state must be stored in
  the ZODB.

  I propose to allow local components to be configurable through ZCML. This
  goal became feasible with the recent component architecture refactorings by
  Jim Fulton. Any site (local or global) can now have a set of base sites that
  are used to provide additional components. I want to allow ZCML to specify
  any number of base sites and add components to it. Here is an example of how
  I imagine it would look (ZCML)::

    <site name="my-base-site" />

    <configure use-site="my-base-site">
      ...
    </configure>

  The new ``site`` directive creates a new site. The ``configure`` directive
  will grow an ``use-site`` attribute that specifies the site to put in the
  components. By default, ``use-site`` will use the global site. This also
  ensures full backward-compatibility.

  There will be a new registry of all ZCML-defined sites. All existing ZCML
  directives have to be reviewed and it must be ensured that they are
  multi-site aware. The tricky part of the implementation will be to hook in
  those sites as bases to local sites. It must be ensured that the ZODB can
  still load sites that have filesystem-based sub-sites; error handling must
  be carefully considered; and a UI must be written.

  I believe that this feature is essential for highly customizable
  applications like Plone. It will allow skinning of sites using
  filesystem-based development, something several people in the Plone
  community have wanted for a long time. I think once the community
  starts to discover the implications of this development, many new
  possibilities will open.

  Another big problem in Zope 3 is the startup time. Some code profiling has
  determined that most of the time in the startup process is lost in parsing,
  converting and validating ZCML directives with their 

reloading modules (was Re: [Zope3-dev] Re: Google SoC Project)

2006-05-09 Thread Shane Hathaway

Adam Groszer wrote:

What about pushing the problem down to a lower level, to Python
itself? I think all developers are fighting the same problem, so all
Python developers would benefit from the solution. As far as I know
(I may be wrong), few if any languages support this, so it would
also be a big plus for Python.

As I don't have really deep knowledge of the Python interpreter
itself, I cannot judge how weird the idea is. Maybe we should ask
Guido for his thoughts on it.


I've spent time thinking about this.  Modern operating systems are 
surprisingly good at reloading processes, but in general, it's hard to 
reload pieces of a process.  What's the difference?


I think the difference is in the type of interdependence.  Operating 
systems force processes to talk to each other through high level 
mechanisms like files, streams, sockets, memory mapped I/O, and so on. 
Good programmers understand that processes can die and thus make their 
software resilient to communication channel interruptions.


Within a process, programmers have no such expectation.  Once the 
programmer imports a module, the programmer expects the imported module 
to remain unchanged.  There is rarely any concept that modules are 
actually communicating with each other.  A sticky morass of inter-module 
pointers quickly forms, leaving little hope of reliably reloading 
arbitrary modules.  The operating system has to intervene in order to 
start the process over.


Shared memory makes it possible to link processes at a deeper level, but 
in practice, shared memory is used mostly for threading.  It's no 
coincidence that multiple threads are generally thought of as a single 
process that has to restart together.  Once two processes share 
pointers, it's hard to unbind them.


So I have considered two basic approaches for reliably reloading a module:

1) Code the reloadable module as a pure communication endpoint, treating 
the module almost like a process.  No other modules should import from 
the module; instead, the module should register itself with a framework 
and other modules should talk to the module only through that framework. 
 This is a good approach for writing reloadable application-specific 
plugins.  You can also support clusters of modules that represent a 
single plugin.


The Zope 2 refresh mechanism works quite well with products written this 
way.  Unfortunately, keeping modules free of interdependencies is 
difficult, and that's a major support risk.


2) Make reloadable code fundamentally different.  If module X is 
supposed to be reloadable, and X creates a module-level global variable 
Y, and module Z imports Y, then Y needs to be decorated in such a way 
that Z's view of Y can change automatically when X is reloaded.


This second approach has subtle limitations, though.  What if Y has the 
value 10 and Z defines a global variable A whose value is (Y**2)?  The 
value of A might need to change when Y changes, but how can we arrange 
for that to happen without making a mess of the code?  I doubt there's 
any reasonable general solution.
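
A minimal sketch of this second approach and its limitation, with the three
modules collapsed into one file for brevity. The ``Reloadable`` cell and its
``rebind`` method are hypothetical names invented for illustration; they stand
in for whatever decoration a real reload mechanism would use.

```python
class Reloadable:
    """Indirection cell: importers hold the cell, not the value inside it."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def rebind(self, value):
        # Would be called by the (hypothetical) reload machinery.
        self._value = value


# Module X exports Y wrapped in a cell.
Y = Reloadable(10)

# Module Z imports Y and also derives A from it at import time.
z_view_of_Y = Y      # stands in for "from X import Y"
A = Y.get() ** 2     # computed once, when Z is first executed

# Reloading X rebinds the cell to the new value.
Y.rebind(12)
```

After the rebind, Z's view of Y follows the new value, but A still holds the
square of the old one, which is exactly the staleness described above.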


Even more subtle is what happens when a reloadable module holds a 
registry of things imported from other modules.  When the module is 
reloaded, should the registry get cleared?  Zope 2's refresh says the 
registry should be cleared, but in practice, this confuses everyone.
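
The registry problem can be reproduced with Python's own reload. The sketch
below uses ``importlib.reload`` from current Python (the thread predates it
and would have used the ``reload`` builtin); the module name
``registry_demo`` and its contents are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module with a module-level registry.
SOURCE = '''
registry = {}

def register(name, obj):
    registry[name] = obj
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "registry_demo.py").write_text(SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import registry_demo

        # Another module registers something at import time.
        registry_demo.register("pdf", "PDF renderer from another module")
        old_registry = registry_demo.registry
        entries_before = sorted(old_registry)

        # Re-executing the module body rebinds ``registry`` to a new dict.
        importlib.reload(registry_demo)

        entries_after = sorted(registry_demo.registry)
        old_dict_survives = "pdf" in old_registry
        same_dict = registry_demo.registry is old_registry
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("registry_demo", None)
```

The reloaded module ends up with a fresh, empty registry, while anything
that kept a reference to the old dict still sees the old entry. The
registering module is not re-run, so its registration is silently lost.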


To solve this, I think reloadable modules need to have a special global 
namespace.  Everything in the global namespace, as well as everything 
reachable from the global namespace, must be explicit about what happens 
at the time another module imports it or the module is reloaded.  I 
think this could make a refresh mechanism like the one in Zope 2 
reliable.  It has a lot of similarity with persistent modules, but it 
might be simpler.  I haven't thought it all the way through.  The idea 
came to me about halfway through this post. :-)


Shane



Re: [Zope3-dev] Re: Google SoC Project

2006-05-09 Thread Dieter Maurer
Jim Fulton wrote at 2006-5-9 07:22 -0400:
 ...
 Finally, there's a lot of interest in generating configuration
 actions in Python, rather than ZCML.  I suspect that avoiding
 XML processing, conversion, and validation might speed startup
 quite a bit.

Moreover, if the component performs its own reregistration
on reload, the Z2 refresh may be possible as well.

We use the Z2 refresh all the time and are very satisfied.
Of course, with a component (i.e. Product in Z2), all dependent
components have to be refreshed as well. We do this with a little
tool of our own. With a decent dependency spec, almost all
refreshes behave as expected.
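
The tool itself is not shown in this thread. As a rough guess at what
"refresh all dependent components" could involve, the sketch below computes a
refresh order from a dependency spec. The product names, the ``depends_on``
mapping and the ``refresh_order`` function are all hypothetical;
``graphlib.TopologicalSorter`` is from the Python 3.9+ standard library.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency spec: product -> products it depends on.
depends_on = {
    "BaseProduct": set(),
    "Catalog": {"BaseProduct"},
    "Shop": {"Catalog"},
    "Blog": {"BaseProduct"},
    "Unrelated": set(),
}


def refresh_order(changed, depends_on):
    """Return the changed product and its dependents, dependencies first."""
    # Collect everything that depends on the changed product, transitively.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for product, deps in depends_on.items():
            if product not in affected and deps & affected:
                affected.add(product)
                grew = True
    # Restrict the graph to affected products and sort it topologically,
    # so that each product is refreshed after the ones it depends on.
    graph = {product: depends_on[product] & affected for product in affected}
    return list(TopologicalSorter(graph).static_order())


order = refresh_order("BaseProduct", depends_on)
```

Here a change to ``BaseProduct`` schedules ``Catalog``, ``Shop`` and ``Blog``
for refresh as well, and leaves ``Unrelated`` alone.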

 ...
I'm really not interested in a reload facility, like the one commonly
used in Zope 2, that is not robust.  I've wasted too many hours
helping people debug problems that were caused by reload mishaps.

I hear of very few problems here.

-- 
Dieter