All last week we tested in both our internal environments and the
customer environment. We could never replicate the issue internally. In
the customer environment the issue is reproducible, but no set of steps
reproduces it consistently. The issue either does or does not manifest
itself shortly after a restart of the web application. We are therefore
doing the following: restart the server, execute a use case which should
fire the rules, and verify whether the "getConfigValue" rule completed.
Then restart the server again and repeat. Using these steps at the
customer site leads to failures on a small percentage of restarts
(< 20%). In short, it is reproducible but very inconsistent.
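For anyone who wants to automate the cycle described above, a minimal sketch of the tally might look like the following. It is not what we ran at the customer site. The `restart` and `ruleCompleted` hooks are hypothetical placeholders for whatever restarts the server and checks the rule output, and the class name is made up for illustration.

```java
import java.util.function.BooleanSupplier;

/** Repeats a restart-and-check cycle and tallies how often the check fails. */
final class RestartTrialRunner {

    /** Outcome of a series of trials. */
    static final class Summary {
        final int trials;
        final int failures;

        Summary(int trials, int failures) {
            this.trials = trials;
            this.failures = failures;
        }

        /** Fraction of trials that failed, between 0.0 and 1.0. */
        double failureRate() {
            return trials == 0 ? 0.0 : (double) failures / trials;
        }
    }

    /**
     * Runs the cycle the given number of times.
     *
     * @param restart       hypothetical hook that restarts the server
     * @param ruleCompleted hypothetical hook that executes the use case and
     *                      reports whether the rule ran to completion
     */
    static Summary run(int trials, Runnable restart, BooleanSupplier ruleCompleted) {
        int failures = 0;
        for (int i = 0; i < trials; i++) {
            restart.run();
            if (!ruleCompleted.getAsBoolean()) {
                failures++;
            }
        }
        return new Summary(trials, failures);
    }

    public static void main(String[] args) {
        // Stand-in check that fails on every fifth trial, purely for illustration.
        int[] counter = {0};
        Summary s = run(10, () -> { }, () -> ++counter[0] % 5 != 0);
        System.out.println(s.failures + " of " + s.trials + " restarts failed");
    }
}
```

Counting failures over a fixed number of restarts at least gives a number to compare between builds, which matters when the failure only shows up on a minority of restarts.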
We set the IBM JIT compiler to optimize all methods in the jess package
at scorching level to try to root out any problems due to compiler
optimizations, but this did not cause the issue to become 100%
reproducible as hoped. The only explanation which remains is that there
is a concurrency issue during the initialization of the Rete instance or
during function execution which causes a working application to suddenly
stop working.
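One way to test the concurrency theory outside the application server is to release several threads into the suspect code at the same instant and record anything that is thrown. The sketch below is only that, a sketch: `ConcurrentStartProbe` is a hypothetical helper, and the task passed in is a stand-in for the real Rete initialization or rule execution, which is not shown here.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Starts a task on several threads at once and collects what it throws. */
final class ConcurrentStartProbe {

    /**
     * Runs the task once on each of the given number of threads, releasing
     * them together, and returns every Throwable the task raised.
     */
    static List<Throwable> run(int threads, Runnable task) {
        List<Throwable> failures = new CopyOnWriteArrayList<>();
        CountDownLatch ready = new CountDownLatch(threads);
        CountDownLatch go = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);

        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                ready.countDown();
                try {
                    go.await();          // wait until every thread is ready
                    task.run();
                } catch (Throwable e) {  // Throwable, so Errors are recorded too
                    failures.add(e);
                } finally {
                    done.countDown();
                }
            }).start();
        }
        try {
            ready.await();               // all workers are parked on the latch
            go.countDown();              // release them together
            done.await();                // wait for all of them to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return failures;
    }

    public static void main(String[] args) {
        // Stand-in task; the real probe would initialize or call the engine.
        AtomicInteger calls = new AtomicInteger();
        List<Throwable> failures = run(8, () -> calls.incrementAndGet());
        System.out.println(calls.get() + " calls, " + failures.size() + " failures");
    }
}
```

Catching `Throwable` rather than `Exception` is deliberate here, since an `Error` thrown on a worker thread would otherwise vanish without any of the later printouts running.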
Since this is a production application we had to be proactive in
addressing the problem. We rewrote the RHS of the "getConfigValue" rule
as a Userfunction (see below). This should prevent the situation where
the "getConfigValue" rule is fired but fails to complete its work.
However, we have no confidence that similar issues will not occur in
other rules/functions. We will be carrying out additional testing on the
new version, which includes the new Userfunction implementation. Any
help or advice you can provide would be very welcome.
/**
 * Jess function which pulls a configuration value. The function will be
 * registered with each Rete instance created by our application. This
 * logic was formerly implemented as the RHS of a Jess rule but due to
 * some unknown problem with Jess the logic was not completely executed
 * all the time. We decided to re-write as a Java implementation rather
 * than declare as a deffunction for the same reasons.
 */
public class GetConfigValueFunction implements Userfunction
{
    private volatile IConfigurationLocator configurationLocator;

    /**
     * Constructor
     *
     * @param configurationLocator
     */
    public GetConfigValueFunction(IConfigurationLocator configurationLocator)
    {
        this.configurationLocator = configurationLocator;
    }

    public Value call(ValueVector vv, Context context) throws JessException
    {
        String pid = vv.get(1).stringValue(context);
        String system = vv.get(2).stringValue(context);
        String subsystem = vv.get(3).stringValue(context);
        String name = vv.get(4).stringValue(context);
        String result = null;
        try
        {
            result = configurationLocator.getConfigurationValue(system,
                    subsystem, name);
        } catch (Throwable e)
        {
            if (Configuration.name_upgradeRejectedTaskCardClosingRequirement.equals(name))
            {
                result = "0";
            } else
            {
                LOG.error("rules.lookupConfigError",
                        new Object[]{system, subsystem, name, e.getMessage()}, e);
                result = "BadConfig";
            }
        }
        Deftemplate template =
                context.getEngine().findDeftemplate(DEFTEMPLATE_CONFIG);
        Fact fact = new Fact(template);
        fact.setSlotValue("pid", new Value(pid, RU.SYMBOL));
        fact.setSlotValue("system", new Value(system, RU.SYMBOL));
        fact.setSlotValue("subSystem", new Value(subsystem, RU.SYMBOL));
        fact.setSlotValue("name", new Value(name, RU.SYMBOL));
        fact.setSlotValue("value", new Value(result, RU.STRING));
        context.getEngine().assertFact(fact, context);
        return new Value(result, RU.STRING);
    }

    public String getName()
    {
        return "get-config-value";
    }

    /**
     * The name of the Jess Deftemplate which will be used to create the
     * config fact
     */
    private static final String DEFTEMPLATE_CONFIG = "CONFIG::config";

    private static final Log LOG =
            LogFactory.getLog(GetConfigValueFunction.class);
}
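The lookup-with-fallback part of the function above does not depend on Jess, so it can be exercised on its own. The following is a minimal sketch of that idea, not our production code. `ConfigValueResolver` and its `Lookup` interface are hypothetical stand-ins for the function and `IConfigurationLocator`, the constant value is assumed from the symbol used in the original rule, and the error logging is left out.

```java
/** Lookup with the same fallback values as the Userfunction above. */
final class ConfigValueResolver {

    /** Hypothetical stand-in for the application's configuration locator. */
    interface Lookup {
        String getConfigurationValue(String system, String subsystem, String name)
                throws Exception;
    }

    /** Assumed value of the constant used in the real function. */
    static final String OPTIONAL_NAME = "upgradeRejectedTaskCardClosingRequirement";

    /**
     * Returns the configured value, or a fallback if the lookup throws:
     * "0" for the optional setting, "BadConfig" for anything else.
     */
    static String resolve(Lookup lookup, String system, String subsystem, String name) {
        try {
            return lookup.getConfigurationValue(system, subsystem, name);
        } catch (Throwable e) {
            // The real function also logs the failure in the "BadConfig" case.
            return OPTIONAL_NAME.equals(name) ? "0" : "BadConfig";
        }
    }

    public static void main(String[] args) {
        Lookup failing = (system, subsystem, name) -> {
            throw new IllegalStateException("locator unavailable");
        };
        System.out.println(resolve(failing, "MaterialManagement",
                "MaterialManagement", "routeTaskCardsMaterials"));
    }
}
```

Keeping this logic in a plain static method means the fallback behaviour can be checked without creating a Rete instance, which leaves the Jess-facing `call` method responsible only for unpacking arguments and asserting the fact.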
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Hasan Khan
Sent: Monday, August 11, 2008 11:29 AM
To: [email protected]
Subject: JESS: Jess Problem
Hello,
Sorry to bother you again, I have narrowed down my previous issue to the
following:
My rule is as follows:
(defrule getConfigValue "loads a single configuration value as a fact"
    (declare (auto-focus true))
    (need-config (pid ?pid) (system ?system) (subSystem ?subSystem)
                 (name ?name) (value ?value))
    =>
    (printout t "the getConfigValue rules fires" crlf)
    (printout t "the system is " ?system crlf)
    (printout t "the sub-system is " ?subSystem crlf)
    (printout t "the name is " ?name crlf)
    (printout t "the value is " ?value crlf)
    (bind ?configLocator (fetch configurationLocator))
    (if (eq ?configLocator nil)
        then
        (printout t "the configuration locator is null" crlf)
        else
        (printout t "the configuration locator is not null" crlf)
    )
    (try
        (printout t "entering the try block of the rule getConfigValue" crlf)
        (bind ?result (call ?configLocator getConfigurationValue ?system
                            ?subSystem ?name))
    catch
        (printout t "catching exception in getConfigValue" crlf)
        (if (eq ?name upgradeRejectedTaskCardClosingRequirement)
            then
            (bind ?result "0")
            else
            (printout t "entering the exception message" crlf)
            (bind ?exception (call ?ERROR getCause))
            (printout t "error while looking up config value "
                      ?system ", " ?subSystem ", " ?name crlf)
            (printout t "message: " (?exception getMessage) crlf)
            (printout t "cause: " (?exception getCause) crlf)
            (bind ?result BadConfig)
        )
    )
    (printout t "asserting the fact" crlf)
    (printout t "the result is " ?result crlf)
    (assert (config (pid ?pid) (system ?system) (subSystem ?subSystem)
                    (name ?name) (value ?result)))
    (printout t "exiting configvalue rule" crlf)
)
During the firing of the above rule the following is the output:
[8/11/08 10:04:20:430 CDT] 00000045 SystemOut O the getConfigValue rules fires
[8/11/08 10:04:20:436 CDT] 00000045 SystemOut O the system is MaterialManagement
[8/11/08 10:04:20:436 CDT] 00000045 SystemOut O the sub-system is MaterialManagement
[8/11/08 10:04:20:436 CDT] 00000045 SystemOut O the name is routeTaskCardsMaterials
[8/11/08 10:04:20:436 CDT] 00000045 SystemOut O the value is nil
[8/11/08 10:04:20:436 CDT] 00000045 SystemOut O the configuration locator is not null
[8/11/08 10:04:20:436 CDT] 00000045 SystemOut O entering the try block of the rule getConfigValue
Also, I enabled logging in the bean itself to check whether the correct
value was being returned, and it shows the following:
DEBUG 240038942 [WebContainer : 6]
(fleetcycle.util.LoggingUtility.SecurityAndConfigurationAccess)
2008-08-11 10:04:20,438 - Configuration query:
system=[MaterialManagement], subsystem=[MaterialManagement],
name=[routeTaskCardsMaterials], returning value java.lang.String#c420c42
([Task Cards and Materials].)
This is extremely puzzling, to say the least, as the code does not reach
any of the printouts after the line:
(bind ?result (call ?configLocator getConfigurationValue ?system
?subSystem ?name))
It is as if there is a black hole in that call :) Anyhow, please let me
know how this could be possible.
Thanks,
Hasan
Confidentiality Notice:
**********************************************
This E-mail and any attachments thereto, are intended only for use by
the addressee(s) named herein and may contain legally privileged and/or
confidential information. If you are not the intended recipient of this
E-mail, you are hereby notified any dissemination, distribution or
copying of this E-mail, and any attachments thereto, is strictly
prohibited. If you receive this E-mail in error, please immediately
notify me by reply E-mail or telephone at (218) 723-7887 and permanently
delete the original and any copy of this E-mail, and any printout
thereof.
--------------------------------------------------------------------
To unsubscribe, send the words 'unsubscribe jess-users [EMAIL PROTECTED]'
in the BODY of a message to [EMAIL PROTECTED], NOT to the list
(use your own address!) List problems? Notify
[EMAIL PROTECTED]
--------------------------------------------------------------------