On 05/11/2012 12:40 AM, Jacopo Cappellato wrote:
Hi Adam,
please see inline:
On May 11, 2012, at 4:52 AM, Adam Heath wrote:
On 04/20/2012 07:53 AM, jaco...@apache.org wrote:
Author: jacopoc
Date: Fri Apr 20 12:53:35 2012
New Revision: 1328356
URL: http://svn.apache.org/viewvc?rev=1328356&view=rev
Log:
The cache of parsed Groovy scripts was not thread safe; this issue, in
instances with several concurrent threads running the same script the first
time (i.e. not cached) could cause the same script parsed multiple times and
then added to the cache (overriding the previous value); this event was causing
the clearing of caches in Freemarker; because of a bug in Freemarker [*] this
could cause a deadlock.
The issue is present on all versions of Freemarker but it is less frequent on
latest version because of the refactoring of caches happened after release
2.3.10.
[*]
https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
Modified:
ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
Modified: ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
URL:
http://svn.apache.org/viewvc/ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java?rev=1328356&r1=1328355&r2=1328356&view=diff
==============================================================================
--- ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
(original)
+++ ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java Fri Apr
20 12:53:35 2012
@@ -127,10 +127,17 @@ public class GroovyUtil {
} else {
scriptClass = parseClass(scriptUrl.openStream(),
location);
}
- if (Debug.verboseOn()) {
- Debug.logVerbose("Caching Groovy script at: " + location,
module);
+ synchronized (parsedScripts) {
+ Class<?> scriptClassCached = parsedScripts.get(location);
+ if (scriptClassCached == null) {
+ if (Debug.verboseOn()) {
+ Debug.logVerbose("Caching Groovy script at: " +
location, module);
+ }
+ parsedScripts.put(location, scriptClass);
+ } else {
+ scriptClass = scriptClassCached;
+ }
}
- parsedScripts.put(location, scriptClass);
}
return scriptClass;
} catch (Exception e) {
@@ -177,10 +184,18 @@ public class GroovyUtil {
Class<?> scriptClass = parsedScripts.get(script);
if (scriptClass == null) {
scriptClass = loadClass(script);
- if (Debug.verboseOn()) Debug.logVerbose("Caching Groovy script:
" + script, module);
- parsedScripts.put(script, scriptClass);
+ synchronized (parsedScripts) {
+ Class<?> scriptClassCached = parsedScripts.get(script);
+ if (scriptClassCached == null) {
+ if (Debug.verboseOn()) {
+ Debug.logVerbose("Caching Groovy script: " +
script, module);
+ }
+ parsedScripts.put(script, scriptClass);
+ } else {
+ scriptClass = scriptClassCached;
+ }
+ }
}
-
return InvokerHelper.createScript(scriptClass,
getBinding(context)).run();
} catch (CompilationFailedException e) {
String errMsg = "Error loading Groovy script [" + script + "]: "
+ e.toString();
Er, no, this is the *wrong* way to fix this. parsedScripts is a UtilCache
instance, not a Map, not even a HashMap. It is *designed* to not corrupt
without synchronization.
The proper fix is:
Class scriptClass = parsedScripts.get(script);
if (scriptClass == null) {
scriptClass = genClass();
parsedScripts.putIfAbsent(script, scriptClass);
scriptClass = parsedScripts.get(script);
}
Yeah, you are 100% right that your above is functionally equivalent (it fixes
the bug too) but your is more concise and elegant. The only reason I
implemented a custom check-then-act pattern (this is not a DCL!) is that I
didn't notice that we were using a thread safe class: I know it may sound
silly, but I was working on a similar fix for the freemarker code and in
freemarker they were using an HashMap for the cache so I have implemented a
similar code that I then copied over to OFBiz.
What was the real problem you were attempting to fix? The only issue that
would have happened with the previous code was the case of a static variable
being defined in the parsed script, and during a race condition while parsing,
2 separate classes for the same script being in the system.
The problem was that the same class was parsed by different threads and then
added to the cache multiple times (of course each time overriding the same
value as the key was the same): this is harmless in OFBiz but Freemarker
interpreted this as if the class was changed and this triggered a complete
clearing of cache; and, because of a bug in freemarker this could cause a
deadlock (in freemarker).
As I mentioned, I have sent a fix to Freemarker for this, but if you are
interested to the details you can refer to:
https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
https://sourceforge.net/mailarchive/message.php?msg_id=29148020
Haven't yet looked at those urls, but just based on your description,
the only way to fix code that has broken locking is to wrap it in
synchronized. If you use non-locking techniques, the code being called
might still be run in parallel, which won't help the situation.
Let me look into this further before you this code is changed.
If I find out that there is a fixed freemarker(or, I come up with a
patch), then I assume we'd be for including that fixed freemarker. What
about backporting that to other branches(just the locking fix for
freemarker, plus whatever locking fix in ofbiz)?
.>> This change is in the release branch. I'd like to know the real
reason this was changed in ofbiz, and then my fix applied to the branch.
I can update the trunk and release branches with code that actually takes into
account that the parsedScript variable is an instance of a thread safe class...
then you can review my changes.
ps: I can't recommend it enough, but in addition to the entity model data
resource books, everyone working on ofbiz should also be reading java
concurrency in practice.
"Java Concurrency In Action" is a great book, one of the best technical books I
have ever read, and it is always in my desk.
pps: It is my goal to remove all the DCL in ofbiz; I'd love it if others would
help.
I completely agree with this goal, but, as I mentioned above, the code I wrote
above doesn't have anything to do with the double-checked locking anti-pattern.
But I am with you completely when we talk about properly implementing
concurrency in OFBiz... there is a lot of code that concerns me greatly (see
for example ImageManagementServices.checkExistsImage(...)).
The code above that I originally quoted is doing DCL.