Hi folks, I'm reposting this email in case it didn't get circulated on Monday.
Last week I started a forum thread titled:
ND4 License & server stability problems
(see the original post below). This is a follow-up to it.
We're still having the same problems. Here's some new info:
1. The license error I saw in the "core" dump file was referencing a Visigenic
license:
"WARNING: Your licensed period for the use of this EVALUATION version of the
Software has expired. Further use of the Software is prohibited without the
purchase of a permanent license. You are bound by the terms of the Visigenic
Software, Inc. license agreement that you consented to when you ordered or
installed this Software."
Our ND SysAdmin support person here says we can disregard this harmless message.
Is that true? I'm inclined to believe him because our ND4 app server is still
working.
2. Even though we bounce the ND4 app server every night, we still have the
problem where one particular ND4 app on our HP-UX box gives us the generic
"Could not connect to the app server..." error, while the other ND4 apps on the
same box, which talk to the very same database, work just fine.
I believe I have narrowed the problem down to 2 particular pages in the
problematic app. As long as I don't make any changes on those 2 pages, everything
is fine. But if I modify one of those pages on my devel NT box and FTP it over
to the prod HP-UX box, 90% of the time it will immediately give the "could not
connect..." message page after I bounce the server. However, I notice that
sometimes it won't give me this error immediately; instead it will let me FTP
the pages over and work fine, then about 2 days later, for no reason at all (I
haven't touched the code), it starts giving the "could not connect..." message
page.
Could this have anything to do with the # of workers, and how the classes are
loaded?
CURT wrote:
You might want to try project pre-loading. That way, the app server builds
all the objects before it takes on session processing.
-- Curt
Could someone tell me if preloading this problematic project might help? If so,
how do I set it up to preload?
BTW: we have 2 workers on our box, with 8 clients per worker.
3. One other thing I notice on these 2 problematic pages is that they both
contain CSpTransaction code. Although it compiles fine and runs with no errors,
maybe there is something bad lurking around that only comes out after both pages
are accessed.
Thanks ahead for any comments or advice.
Janet
>>> <[EMAIL PROTECTED]> 9/28/1999 11:55:56 AM >>>
Hi folks, (please excuse this long note. I had the detail already typed
up for an internal email.)
Last week, there was a thread about that generic "could not connect to app
server" msg. It seemed to be related to the ND4 app server not creating a .ser
file. I had that problem, and was finally able to get a "stable" version of the
app out to production by playing around with the order in which I made my
changes (2 pages were trouble; the other pages and DOs were all fine to tweak
confidently, knowing the crazy error would not pop up). I know that sounds like
bogus superstition, but that's how it went.
Now the app is in Pilot (with only 15 users) running on our production HP-UX
box, talking to Informix7, which also lives on the same box. (Perhaps that's a
problem in itself, as the box isn't a supercruncher by any means.)
Although we have been bouncing the app server every night, we still encounter a
consistent problem every morning around 10:15AM. The app server does not crash;
however, our application loses its ability to communicate with the database
server.
NOTE: When this db connectivity problem occurs, oddly, it doesn't affect our
other 3 ND4 apps that run on the same box using the exact same db driver to talk
to the exact same database. These other apps continue working just fine,
although they currently have little to no traffic. This puzzles me. And it also
scares me, since this is that "temperamental" app that was giving me problems in
development with the .sid business.
Below are my findings to date about it. I'm hoping the expired Visigenic license
is the sole problem, but it still seems odd that only the one ND4 app is
affected when the problem occurs.
If nothing else, I'd like to know:
1. Is it unusual that the ND4 app server's database connection to only ONE app
would die, while the other apps work fine?
2. Below you'll see I get the warning 'truncating cursor rows fetched from 9999
to 5000' in the log. The 9999 is the default I set the MaxDisplayRows property
to, in both the .sdo file and the corresponding .spg file. Is that a bad thing
to do? Could it cause problems if ALL my "unlimited length" Repeateds (which
never return more than 300 rows, and have a reasonable # of columns) have this
default? Should I be more careful, and use a lower # like 999 when appropriate?
Thank you kindly for any help. As you can tell, I'm new to this ND4 server admin
stuff.
Janet
The tome continues below...!
Reading thru it you'll learn there appear to be 2 potential problem areas: an
expired license (see #1 below; I'm waiting for someone here to investigate it),
and configuration problems with the # of workers, clients, etc. (see #5 below).
1. When the app server was bounced last night using ndappsrv stop/start, a core
dump file was written.
The core was also "touched" when the database connection was lost this AM.
The core dump mentions an expiring Visigenic license (see the screen shot pasted
below).
2. I found a reference in the beginning of our ND4 log from way back in Feb '99
(for a different ND4 app) that also mentioned an expiring license. The dump
mentions a 6-month license period--and this is approx 6/7 months later than
Feb.
3. The log indicates that the crash on Friday 10:40AM involved a generic OS
network error:
System error code is 27 ([CANTOPEN] Unable to open : Exec format error)
. Vendor error#1 code is 0 ()
Vendor error#2 code is 0 ()
Here's the Informix error message for 27:
-27 - Operating system error
An operating-system error code with the meaning shown was unexpectedly
returned to the database server. Check the documentation for your
operating system to find out what too large might mean in the context
of the current operation.
Prior to getting that generic error, this database warning was written to the
log:
'truncating cursor rows fetched from 9999 to 5000'
4. The log indicates that the crash on Monday shortly after 10:06AM was
preceded by this warning:
"no free workers "
then these errors: "Database server is currently processing SQL task."
"-NetdynPartition DefaultPartitionObjectFactoryImpl has terminated abnormally
java.lang.ThreadDeath"
then at 11:38AM it was giving the same generic error #27 (shown above from
Friday).
Before today's crash, the same warning message about the truncating cursor rows
appeared.
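To check whether the messages in items 3 and 4 really do show up in the same
order before every failure, a small scan of the log might help. This is only a
rough sketch: it assumes the ND4 log is a plain-text file with one entry per
line, and the class name LogScan and the command-line path are made up for
illustration. It matches nothing but the message fragments quoted above.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ND4: tallies the log messages quoted above.
final class LogScan {
    // Message fragments copied from the log excerpts in items 3 and 4.
    static final String[] FRAGMENTS = {
        "truncating cursor rows fetched",
        "no free workers",
        "Database server is currently processing SQL",
        "java.lang.ThreadDeath",
        "System error code is 27"
    };

    // Returns the first known fragment contained in the line, or null if none.
    static String classify(String line) {
        for (String fragment : FRAGMENTS) {
            if (line.contains(fragment)) {
                return fragment;
            }
        }
        return null;
    }

    // Counts matching lines per fragment, in the order FRAGMENTS lists them.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String fragment : FRAGMENTS) {
            counts.put(fragment, 0);
        }
        for (String line : lines) {
            String hit = classify(line);
            if (hit != null) {
                counts.put(hit, counts.get(hit) + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java LogScan <path-to-log>");
            return;
        }
        // ISO-8859-1 accepts any byte, so odd characters in the log won't abort the read.
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        // Print matching entries in file order so the sequence before each crash is visible.
        for (int i = 0; i < lines.size(); i++) {
            if (classify(lines.get(i)) != null) {
                System.out.println((i + 1) + ": " + lines.get(i));
            }
        }
        for (Map.Entry<String, Integer> entry : count(lines).entrySet()) {
            System.out.println(entry.getValue() + " x " + entry.getKey());
        }
    }
}
```

Run as "java LogScan <path-to-log>" against a copy of the log. If the
"truncating cursor rows" warning appears on mornings without a failure too, it
is probably not the trigger.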
Thanks for your help!
Janet