Shmuel Metz writes:
>IMHO, rolling them out with bugs is even more expensive, especially
>bugs that cause noncompliance with regulations. Testing encompasses
>much more than just performance.

I see we have another "Doctor No" in the audience. :-)

How do YOU know it's "even more expensive"? *Obviously* you can't know that
for an unspecified hypothetical.

You haven't articulated an inviolable law of nature, the way physics
sometimes can. This issue is situational, with no real absolutes. As a real
example, I worked with a customer that had a memory leak in their
application program (Java, as it happens), so eventually the heap would be
fully consumed and garbage collection couldn't free up space. They couldn't
quite stomp out the root cause(s) of the memory leak, but they faced a
government-mandated deadline to get their application into service. If they
failed to deliver they'd get fined or potentially even lose the whole
contract. So (upon my recommendation) they worked around that problem by
implementing: two production instances (probably a good idea anyway),
better monitoring (to improve diagnostics so they could track the memory
leak more closely and provide better feedback to development), fully
automated recycling of each production instance before the memory leak
could cause damage, and (optionally, but I think they did it) session
persistence so user state would be retained across instances, avoiding end
user disruptions. That multifaceted mitigation strategy tided them over
nicely until they could get the memory leak fixed, and they did
post-production in a future program update. That was the right decision for
the business, unequivocally. They were very happy, and so was the
government. Was Version 1.0 bug free? Was Version 1.0 as well optimized as
it could have been? No, and no.
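The "fully automated recycling" piece of that strategy can be sketched
roughly. This is a minimal illustration of the idea, not their actual
implementation; the instance names, the 80% threshold, and the function
itself are all invented for the example.

```python
# Hypothetical sketch of the "automated recycling" idea: watch each
# production instance's heap usage and recycle an instance *before* the
# leak can cause damage -- but never take down more than one instance at
# a time, so the surviving instance keeps serving traffic. All names and
# thresholds here are made up for illustration.

RECYCLE_THRESHOLD = 0.80  # recycle when heap is 80% consumed (invented)

def pick_instance_to_recycle(heap_usage):
    """Given {instance_name: fraction_of_heap_used}, return the single
    instance (if any) to recycle right now, or None.

    Only one instance may be down at a time, so if several are over
    the threshold we recycle the worst offender first.
    """
    over = {name: used for name, used in heap_usage.items()
            if used >= RECYCLE_THRESHOLD}
    if not over:
        return None
    # Recycle the instance closest to heap exhaustion first.
    return max(over, key=over.get)

# Instance "B" is leaking faster, so it gets recycled first:
print(pick_instance_to_recycle({"A": 0.83, "B": 0.91}))  # -> B
# Nobody is near exhaustion, so nothing is recycled:
print(pick_instance_to_recycle({"A": 0.40, "B": 0.55}))  # -> None
```

In practice a scheduler would call something like this periodically,
quiesce the chosen instance, recycle it, and let session persistence keep
user state alive on the other instance in the meantime.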

As an aside, how many of you are still IPLing periodically because of a bug
corrected three decades ago? That's typically rigid, dogmatic operational
nonsense, too, yet it seems to surround me. I feel outnumbered. :-)

If I'm not responsible for making the final decisions, my fundamental
approach is to characterize for management, fairly and candidly, with no
b.s., the risks that I know about and, if possible, list out all the
options. Then let them call the *@#$ play. There are a few bright lines
that cannot be crossed, even by management, but "wait for 100% bug free,
fully performance optimized code" isn't actually one of those bright lines.
Otherwise nothing would ever ship, and "known issues" lists wouldn't even
exist. They do, of course.

One of the major value propositions of the mainframe ecosystem is that its
designers assumed correctly, long ago, that developers -- and operators,
for that matter -- are occasionally fallible, but business must still get
done and done well. Thus at every level in the platform architecture there
are rich, robust defenses against human error. That doesn't mean operators
trying to say "No" to everything "new" is helpful. Quite the opposite.

My views are my own, as always. Even when I seem to be the only rational
voice in the room. :-)

--------------------------------------------------------------------------------------------------------
Timothy Sipples
IT Architect Executive, Industry Solutions, IBM z Systems, AP/GCG/MEA
E-Mail: [email protected]
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN