> If you compile and refresh asynchronously without keeping the old
> states of the objects, not only of the classes, you basically
> exchange classes and objects in the middle of a request. Ok,
> granted, this does not happen too often, but it can happen!
> So what happens is that a) the user has to wait in the middle of
> request processing until the atomic compile and refresh is done (or
> not, depending on what you want to lock there), and then, to make
> things worse, suddenly in the middle of the request you have the
> beans and classes exchanged.
> Ok, this is not too different from what happens if you refresh at
> request level without streamlining the requests during the compile
> and refresh cycle.
> So pretty much you end up with one request in an inconsistent state
> and probably errors.
Yes, both beans and classes will be refreshed at any time during the
request, but unfortunately I can't see how that introduces
inconsistency - and please correct me if I'm wrong, as I think you
haven't really explained that either. The new bean is basically just a
clone using a different class; in fact (if it wouldn't raise a
ClassCastException), you could even call something like
oldBean.equals(newBean) and it would return true. Apart from the fact
that the new bean is an instance of a different class, those two
versions don't differ at all. Well, the fact that they're using two
different classes might seem like a big deal at first - I mean, if
that's not a huge difference, what else is?
However, the refresh operation drops all outdated beans, renderers,
etc. at once, so unless you somehow cache those outdated beans on your
own as well, it won't introduce inconsistencies. The former problem is
something that not even a new request can fix, though. I've got the
feeling, however, that you've implemented more aspects differently, I
don't know.
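Just to make the "different class" part concrete, here is a minimal
sketch of why two loaders produce incompatible versions of the same
class name - the Bean class and the /app/classes path are made up for
illustration, this is not code from the module:

import java.net.URL;
import java.net.URLClassLoader;

public class TwoLoadersDemo {
    public static void main(String[] args) throws Exception {
        URL[] classpath = { new URL("file:/app/classes/") };
        // parent == null, so each loader defines the class itself
        // instead of delegating to a common parent definition
        ClassLoader loaderA = new URLClassLoader(classpath, null);
        ClassLoader loaderB = new URLClassLoader(classpath, null);

        Class<?> oldVersion = loaderA.loadClass("Bean");
        Class<?> newVersion = loaderB.loadClass("Bean");

        // same fully qualified name, two distinct runtime types
        System.out.println(oldVersion == newVersion); // prints "false"

        Object newBean = newVersion.newInstance();
        // casting the new instance to the old type fails with the
        // infamous "Bean cannot be cast to Bean" ClassCastException
        oldVersion.cast(newBean);
    }
}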
> See it that way: bean a references classes b and c, with c being
> loaded dynamically at a later stage.
> By the time the classes of a, b and c get recompiled, c has not been
> loaded yet. A developer/user hits refresh while the compile is in
> full force, or has a running request at that time, so he still has
> the old reference to a. But because the classes are exchanged exactly
> at that request, b and c get refreshed. When b and c are then
> referenced, b is still picked up because the old version is in RAM,
> but c is loaded dynamically and is not yet in RAM, and you might end
> up with an error because something does not match (in the worst case
> a class cast along the lines of "c cannot be cast to c"), because for
> a and b you are still on the old version while c is loaded from the
> new version.
Okay, I'll tell you how that works in my case, though I'm not really
sure I got your example entirely right (in fact, I'm most probably
mistaken). The thing is, if class A somehow references class C, C
already has to be loaded at that time - you cannot even load class A
otherwise. Now, if the developer modifies class C, the daemon thread
will obviously notify the system to refresh all relevant beans. If it
turns out that there is a relevant bean of a different class (e.g. the
relevant bean somehow has a dependency on something of type C), the
system will, in my case, tell the reloading class loader to forcefully
reload that particular class (i.e. assuming the relevant bean is an
instance of class A, it will also reload A again, regardless of
whether the source actually changed or not). The purpose of this
forceful reload is to correct linkage dependencies, i.e. if class A on
its own depends on class C (e.g. there's a setter setC(C c)), it will
reload A just to ensure that it's using the correct version of C.
I suppose you cannot really implement it in a different way, as
otherwise you'd have to take care of the order in which you refresh
classes and beans (i.e. determine the class that no other class
depends on, refresh that first, etc. - which doesn't work for cycles,
though).
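If it helps, here is a rough sketch of what I mean by a reloading
class loader - simplified and with assumed names and directory layout,
not the actual implementation: a throwaway, parent-last loader over
the compiled classes directory. "Forcefully reloading" A then just
means loading A through a fresh instance of it, so A's linkage to C
resolves against the new version as well:

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReloadingClassLoader extends ClassLoader {
    private final File classesDir; // e.g. WEB-INF/classes - an assumption

    public ReloadingClassLoader(File classesDir, ClassLoader parent) {
        super(parent);
        this.classesDir = classesDir;
    }

    @Override
    protected synchronized Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            File classFile =
                new File(classesDir, name.replace('.', '/') + ".class");
            if (classFile.isFile()) {
                // define application classes ourselves instead of
                // delegating, so the newest bytecode is always picked up
                byte[] bytecode = readFully(classFile);
                c = defineClass(name, bytecode, 0, bytecode.length);
            } else {
                // JDK and container classes still come from the parent
                c = super.loadClass(name, false);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }

    private static byte[] readFully(File file) throws ClassNotFoundException {
        try {
            InputStream in = new FileInputStream(file);
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                for (int read; (read = in.read(buffer)) != -1; ) {
                    out.write(buffer, 0, read);
                }
                return out.toByteArray();
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new ClassNotFoundException(file.getName(), e);
        }
    }
}

After a change to C, the refresh then boils down to
new ReloadingClassLoader(classesDir, parent).loadClass("A") for every
bean class that links against C.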
regards,
Bernhard
Werner Punz wrote on 12/12/2009 03:09 PM (GMT):
Bernhard Huemer wrote:
> Under normal, no-locking circumstances, the beans get
> replaced in the middle of the request because someone
> else triggered it for the application singleton, which
> is probably fine but somewhat dirty, because in some
> cases this might end up with a temporary ClassCastException
> which is then resolved cleanly at the following request.
Well, you're listing more and more issues that are only valid if you
refresh beans at the beginning of a request. What you're saying is
that the application is in an inconsistent state from the moment you
recompile classes until the beginning of the next request that
refreshes beans, renderers, etc. for which those recompiled classes
are relevant. To be more precise, however, you'd have to say that the
application is in an inconsistent state from the moment you recompile
until all the relevant artifacts are refreshed. As you refresh
artifacts only at the beginning of a request, you'll have to somehow
synchronize requests, granted, but that doesn't mean this is
necessarily also the case if you refresh artifacts in your daemon
thread instead. Ensuring that the recompile/refresh operation is an
atomic one is just so much easier if you don't have to wait for the
next request for the refresh (as - again - that's where you refresh
artifacts).
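To sketch what I mean by an atomic refresh (made-up names, just the
publication pattern, not my actual code): the daemon thread assembles
the complete new set of artifacts off to the side and publishes it
with a single reference swap, so no request can ever observe a
half-refreshed state:

import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class ArtifactRegistry {
    // immutable snapshot of everything the refresh replaces at once
    public static class Snapshot {
        public final Map<String, Object> beans;
        public final Map<String, Class<?>> classes;

        public Snapshot(Map<String, Object> beans,
                        Map<String, Class<?>> classes) {
            this.beans = beans;
            this.classes = classes;
        }
    }

    private final AtomicReference<Snapshot> current =
        new AtomicReference<Snapshot>();

    // daemon thread, after a successful compile and refresh
    public void publish(Snapshot freshlyBuilt) {
        current.set(freshlyBuilt); // one atomic switch, no partial state
    }

    // called once at the beginning of a request; the request then works
    // against this one snapshot for its entire lifetime
    public Snapshot snapshotForRequest() {
        return current.get();
    }
}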
The main issue here is to avoid inconsistent states as much as
possible; if you do the refreshing asynchronously, you just push the
inconsistencies one level up.
I will give an example.
The compile and refresh is atomic - ok, that is a common point!
The main issue is the application state for the user.
If you compile and refresh asynchronously without keeping the old
states of the objects, not only of the classes, you basically exchange
classes and objects in the middle of a request. Ok, granted, this does
not happen too often, but it can happen!
So what happens is that a) the user has to wait in the middle of
request processing until the atomic compile and refresh is done (or
not, depending on what you want to lock there), and then, to make
things worse, suddenly in the middle of the request you have the beans
and classes exchanged.
Ok, this is not too different from what happens if you refresh at
request level without streamlining the requests during the compile and
refresh cycle.
So pretty much you end up with one request in an inconsistent state
and probably errors.
Anyway, I have given the solutions for the problem, and it does not
matter when you compile: either double buffer the classes and objects,
or streamline the requests for the duration of the compile and refresh
of the objects!
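Streamlining the requests is basically a read/write lock around the
whole thing. A sketch with made-up names (this is not the module code,
just the pattern): requests share the read side, and the compile and
refresh cycle takes the write side, so it waits for running requests
to drain and holds back new ones meanwhile:

import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RefreshGate {
    private final ReentrantReadWriteLock gate =
        new ReentrantReadWriteLock(true);

    // wrap every incoming request with this, e.g. in a servlet filter
    public void runRequest(Runnable request) {
        gate.readLock().lock();
        try {
            request.run();
        } finally {
            gate.readLock().unlock();
        }
    }

    // called by whatever triggers the compile, request start or daemon
    public void compileAndRefresh(Runnable compileAndSwap) {
        gate.writeLock().lock(); // waits until running requests are done
        try {
            compileAndSwap.run();
        } finally {
            gate.writeLock().unlock();
        }
    }
}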
> What we are talking about here is a 1% corner case which
> imposes 90% extra work in that area, and that is definitely
> a post 1.0 thing to solve.
Granted, but don't get me wrong. I never meant to point out every
single tiny, inconvenient and maybe even insignificant issue - you
were the one who brought up the Windows file locking issue (which,
btw., I still doubt even exists, as even Windows - if I'm not
mistaken, and if not specified otherwise - grants exclusive read,
write and delete access to only one process at a time). What I'm
saying is: yes, there are certain race conditions, but that's at least
partly a result of your "JSP-like" refresh approach.
I still don't think those issues - except for a longer waiting time -
have anything to do with the JSP-like approach. Granted, you have to
wait for the compiler instead of having it executed in parallel (which
is a fraction of a second), but the rest of the problems with the
inconsistencies of the application state are the same. And thirdly, in
a single-developer environment you basically give the developer back
the control over when to compile instead of enforcing it.
But as I said, that was not even my intention; I just had the JSP
logic in mind when coding it and did not think about asynchronous
compilation.
But the rest of the application state problems exist in either
approach. All you gain is a faster compile, at the price of taking
away the developer's control over when exactly to compile in a typical
dev environment.
> [...] (the biggest issue is simply the singleton constructs like
> application-scoped managed beans, which means double buffering the
> class files so every compile has to go into a separate dir, [...]
Why do you think that you have to use separate directories all the
time? Once the class loader has loaded the class, it's in main memory
anyway; just reuse the in-memory definition of the class, and then you
could basically drop the class file from the file system. What you
probably mean is to somehow freeze the reloading process so that it
only picks up reloaded classes at a certain time, but that doesn't
require you to use separate directories (and again, that's only
required if you refresh artifacts JSP-like).
Not really true, you definitely need a full snapshot; you have
overlooked one corner case:
See it that way: bean a references classes b and c, with c being
loaded dynamically at a later stage.
By the time the classes of a, b and c get recompiled, c has not been
loaded yet. A developer/user hits refresh while the compile is in full
force, or has a running request at that time, so he still has the old
reference to a. But because the classes are exchanged exactly at that
request, b and c get refreshed. When b and c are then referenced, b is
still picked up because the old version is in RAM, but c is loaded
dynamically and is not yet in RAM, and you might end up with an error
because something does not match (in the worst case a class cast along
the lines of "c cannot be cast to c"), because for a and b you are
still on the old version while c is loaded from the new version.
So you either buffer all classes as a snapshot in RAM for the
"compile" transaction (which, with normal classloader logic, is only
possible for about 95% of cases due to the lazy initialisation
classloaders in fact do), so that old requests get a consistent state,
or you buffer the classes on the HD and keep the logic in the
classloader down to the bare minimum - so it is just a question of RAM
or disk space. The other solution is to compile only when no request
is going on and to block all requests until the compile and replace is
done.
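The HD variant would look roughly like this (directory layout and
names are invented for the sketch): every compile writes into a fresh
generation directory, and a request keeps loading lazily initialised
classes from the generation it started with:

import java.io.File;

public class ClassSnapshotStore {
    private final File baseDir; // e.g. WEB-INF/generated-classes
    private int counter = 0;
    private volatile File current;

    public ClassSnapshotStore(File baseDir) {
        this.baseDir = baseDir;
        this.current = new File(baseDir, "gen-0");
        this.current.mkdirs();
    }

    // fresh directory for the compiler to write the next snapshot into
    public synchronized File newGenerationDir() {
        File dir = new File(baseDir, "gen-" + (++counter));
        dir.mkdirs();
        return dir;
    }

    // publish a fully written snapshot; only now do new requests see it
    public void publish(File finishedGeneration) {
        current = finishedGeneration;
    }

    // a request binds its reloading class loader to this directory once
    // and keeps using it for its whole lifetime, even across a publish
    public File currentGenerationDir() {
        return current;
    }
}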
Normal classloader logic can deal with most cases, but not with the
fully dynamic part which gets loaded somewhere in the code via
loadClass!
But as I said, this is a lot of logic overhead to cover a corner case
which is not really that important for a development environment.
The worst case here is just a lost request. And if we look at pure
scripting languages, they do not even remotely try to solve this: if
the application logic and data structures go haywire, the developer
has to perform the reboot in those languages!
For example, you could do something like this: save the timestamp of
the beginning of the request and only reload class definitions if the
last-modified timestamp of the corresponding class file is less than
the previously saved one (i.e. basically, if the class file has been
recompiled before the beginning of the current request, use it - which
also means you won't care about classes recompiled during the
request). However, that's just an idea; I haven't tried it, as I don't
have to implement something like that in my case.
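In code, the idea would look roughly like this (names invented and
untested, as said) - a class file is only picked up if it was
recompiled before the current request began, so a compile landing
mid-request cannot change the classes that request sees:

import java.io.File;

public class TimestampGate {
    // captured once at the beginning of each request, e.g. in a filter
    private static final ThreadLocal<Long> requestStart =
        new ThreadLocal<Long>();

    public static void beginRequest() {
        requestStart.set(Long.valueOf(System.currentTimeMillis()));
    }

    // the reloading class loader asks this before using a class file
    public static boolean mayReload(File classFile) {
        Long start = requestStart.get();
        // recompiled before the request started -> safe to pick up;
        // recompiled during the request -> wait for the next request
        return start != null
            && classFile.lastModified() < start.longValue();
    }

    public static void endRequest() {
        requestStart.remove();
    }
}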
I am doing that on the bean level to kick through the session- and
custom-scoped beans; the timestamp part needs a full snapshot of all
classes, but yes, that is definitely the way to identify when the
transactional boundary is reached.
> And to go back to the original discussion, the compile trigger
> point is mostly a matter of preference. I have to admit doing
> the compile on request start was just because I had the JSP
> behavior in mind when I was coding it; I was not even thinking
> of doing it in parallel in the watchdog daemon thread.
... which is why I told you about the possibility of doing it that way
now. You know, four eyes can see more than two, and I really like this
module - I think it could be a great advantage for MyFaces. That's why
I'm trying to suggest improvements as far as possible. ;-)
Yes indeed... and no offence taken.
regards,
Bernhard
Werner Punz wrote on 12/12/2009 10:31 AM (GMT):
Bernhard Huemer wrote:
I'd rather have a single predictable triggering point than have the
compiler triggered continuously in an unpredictable manner. A
standalone developer can code and save and cause continuous errors,
but at the time he hits refresh, he can be pretty sure that his code
should work (well, often it does not, but that is a different matter).
Even if you compile continuously, the developer can introduce
mistakes, save them, and the application won't pick them up, as it
simply doesn't compile anyway - or do you mean runtime errors? Just
thinking about it: apparently it doesn't really matter at which
point you pick up the changes, as long as you pick them up at all
(which you do). That basically means that if the developer
introduces runtime errors, they will affect your application
regardless of whether you recompile it JSP-like or not. (Btw.,
using the term "JSP-like" to express how you manage compilation
isn't really precise either, as e.g. the Jasper 2 engine provides
background compilation as well - but let's stick with the usual
approach to define what "JSP-like" means.)
Anyhow, if it works JSP-like in your case, then you can't just
treat users and developers the same. The fact that any developer
who uses your module is also a user of your module doesn't really
matter when it comes to race conditions, so I'd suggest we ignore
it.
However, what matters is that there are people who issue requests
to the web server, namely the users, and people who actually modify
the source files of those applications, the developers. The problem
with the users' requests being the "compilation trigger" is
apparently that you'll have to deal with race conditions, as there
are multiple possible request threads. If, however, the developer -
or, more precisely, the daemon thread that checks for file
modifications - triggers compilations, you've only got one thread,
the file monitoring thread, that could possibly access the
compiler, hence no need for synchronization at all in this case!
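Sketched out, the daemon variant is nothing more than the following
(the polling approach, the interval and all names are assumptions on
my side): one background thread watches the source tree and is the
only caller of the compiler, so compiler access itself needs no
synchronization:

import java.io.File;

public class SourceWatchDaemon extends Thread {
    private final File sourceDir; // e.g. src/main/java - an assumption
    private final Runnable compileAndRefresh; // atomic compile+refresh
    private long lastSeen = System.currentTimeMillis();

    public SourceWatchDaemon(File sourceDir, Runnable compileAndRefresh) {
        this.sourceDir = sourceDir;
        this.compileAndRefresh = compileAndRefresh;
        setDaemon(true);
    }

    public void run() {
        while (!isInterrupted()) {
            long newest = newestTimestamp(sourceDir);
            if (newest > lastSeen) {
                lastSeen = newest;
                compileAndRefresh.run(); // only this thread ever compiles
            }
            try {
                Thread.sleep(500); // poll interval, arbitrary
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    private long newestTimestamp(File dir) {
        long newest = 0;
        File[] children = dir.listFiles();
        if (children == null) {
            return 0;
        }
        for (File f : children) {
            newest = Math.max(newest, f.isDirectory()
                ? newestTimestamp(f) : f.lastModified());
        }
        return newest;
    }
}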
Well, we've already talked about it a lot anyway, and it's probably
just a matter of preference; I just wanted to point out some issues
and compare different approaches. Maybe others want to follow that
discussion as well, which is why I'm still responding to these
emails.
Actually, the trigger point of the compiler is really just a matter
of personal preference, but the concurrency issues go way deeper
than that and are mostly singleton-related.
We have application-scoped, session-scoped and request-scoped beans.
Well, what happens if a compile is done in the middle of a request
for someone who hits the site? This happens in both approaches.
Under normal, no-locking circumstances, the beans get replaced in
the middle of the request because someone else triggered it for the
application singleton, which is probably fine but somewhat dirty,
because in some cases this might end up with a temporary
ClassCastException which is then resolved cleanly at the following
request.
If you want to solve it cleanly you have various options.
a) Let the requests which are already in progress run out, then
compile, putting any new request on hold during compilation, and
then let the requests through again.
The compile has to be seen as a transaction boundary: everything
before the compile has to be a single, immutable unit, and
everything after the compile as well.
The problem here starts with long-running requests such as the ones
comet frameworks issue; then the compiler literally has to wait for
ages until it can trigger (until the timeout of the comet-related
long-running XHR request, if you run for instance on Bayeux and not
on WebSockets, which are handled differently).
b) Try to double buffer everything possible so that requests from
before and during the compile see a single application state (the
biggest issue is simply the singleton constructs like
application-scoped managed beans, which means double buffering the
class files so every compile has to go into a separate dir, and
double buffering the managed beans, which means the old beans have
to be preserved until the last JSF request accessing the current
state has terminated - so I even assume we need an unlimited
nesting depth of the application state here).
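In code, the bean side of b) would roughly amount to generations of
the application state (names made up for the sketch; this is what I
mean by the unlimited nesting depth):

import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class GenerationalBeanStore {
    // one complete application state produced by one compile run
    public static class Generation {
        public final Map<String, Object> applicationScopedBeans;

        public Generation(Map<String, Object> beans) {
            this.applicationScopedBeans = beans;
        }
    }

    private final AtomicReference<Generation> current =
        new AtomicReference<Generation>();

    // a request fetches its generation once at the start and holds on
    // to the reference for its entire lifetime, even across a publish
    public Generation pinForRequest() {
        return current.get();
    }

    // the compile/refresh step publishes a complete new generation;
    // an old generation lives on (and is only garbage collected) as
    // long as some running request still references it
    public void publish(Map<String, Object> newBeans) {
        current.set(new Generation(newBeans));
    }
}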
To sum it up in short terms: this is way too much to handle for my
1.0 version, which is mainly aimed at easing the life of the
developers.
I probably will add solution a), but only optionally turned on,
sort of as an additional safety net for production sites which do
not run comet over JSF (99% of all sites). I am not aiming for a
100% perfect solution in 1.0, but only for a solution which should
ease the life of the developers by reducing the number of server
restarts as much as possible.
What we are talking about here is a 1% corner case which imposes 90%
extra work in that area, and that is definitely a post 1.0 thing to
solve. After all, the entire library is not done with 1.0; 1.0 is
just a first version which aims to solve certain things to some
extent.
And we are not talking about rendering the application in an
unusable state, but about users in a multi-user environment possibly
getting an error for exactly one request after a compile - a
situation which cannot happen at all in a single-user dev
environment.
So hot patching a running server, or having multiple developers
programming against a running server, might trigger this, but only
for one request. It simply is not worth solving that for 1.0,
although I am sure some users will run into it, hence this needs to
be documented!
And to go back to the original discussion, the compile trigger point
is mostly a matter of preference. I have to admit doing the compile
on request start was just because I had the JSP behavior in mind
when I was coding it; I was not even thinking of doing it in
parallel in the watchdog daemon thread.