RE: Excessive SQL*Net message from client waits

Karen Morton Thu, 13 Mar 2003 04:28:59 -0800

Not like this nor should it be the "top" event always as seems to be the
case here I don't believe.  And, I know for certain that the client did
everything as quickly as possible during the trace.  Minimal data entry done
and OK buttons clicked without delay...no time out for getting a cup of
coffee in between or anything.  :)


Karen

-----Original Message-----
Zanen
Sent: Thursday, March 13, 2003 2:24 AM
To: Multiple recipients of list ORACLE-L


Hi

Isn't sql*net message from client always sort of on top, because it just
means the rdbms is waiting for the client to send some query/command (user
is not typing/clicking/reading fast enough)


Jack

-----Original Message-----
Sent: donderdag 13 maart 2003 3:19
To: Multiple recipients of list ORACLE-L


Hi All,

I've got a situation where I've collected trace data and am seeing 90% of
total response time is accounted for with the SQL*Net Message From Client
event.  Individual queries within the trace show minimal CPU time used and
no obvious indications of bad SQL being the culprit.  I used the Hotsos
Profiler (way cool) and here's an example of what it shows:

Response Time Component                       Duration         # Calls
Avg        Min      Max
---------------------------------- -------------------- --------------- ----
------- ---------- -------
(i) SQL*Net message from client         500.98s   85.1%           2,757
0.181712s      0.00s   5.91s
(i) unaccounted-for                      23.03s    3.9%
(i) direct path write                    22.38s    3.8%           1,373
0.016300s      0.00s   0.32s
(i) log file sync                        20.70s    3.5%             685
0.030219s      0.00s   0.52s
(i) user-mode CPU                        12.12s    2.1%          12,016
0.001009s      0.00s   1.50s
(i) direct path read                      6.66s    1.1%             985
0.006761s      0.00s   0.09s
(i) db file sequential read               1.09s    0.2%           2,679
0.000407s      0.00s   0.14s
(i) db file scattered read                0.83s    0.1%           2,158
0.000385s      0.00s   0.17s
(i) SQL*Net more data to client           0.50s    0.1%           1,007
0.000497s      0.00s   0.13s
(i) SQL*Net more data from client         0.42s    0.1%               5
0.084000s      0.01s   0.19s
(i) db file parallel read                 0.11s    0.0%              44
0.002500s      0.00s   0.01s
(i) latch free                            0.10s    0.0%              30
0.003333s      0.00s   0.02s
(i) file open                             0.01s    0.0%               8
0.001250s      0.00s   0.01s
(i) SQL*Net message to client             0.00s    0.0%           2,757
0.000000s      0.00s   0.00s
---------------------------------- -------------------- --------------- ----
------- ---------- -------
Total                                   588.93s  100.0%

If you want to see the whole profile please check at
www.morton-consulting.com/pdfs/sqlnetwaitstrace.pdf.

By the way, this single trace is one of 15 that was done and all show the
same SQL*Net waits being, on average, 90% or above of the total time.

The "network guys" did some testing and came back saying that the network
couldn't possibly be the problem (do they ever?).  Here's what they said:
"There are 0 Symptoms and Diagnoses that occur from Physical layer up to
network layer.  In running the trace file in loopback mode using the packet
generator function - I noticed that the average % of utilization across the
network was under 10% utilization. There were only few spikes that reached
10% and only two at 20%.  All spikes seen lasted no more than two seconds.
In analysis of your packet size distribution, it was a perfect bell curved.
In fact, 64 byte sized packets (broadcasts and unicasts) were only 4th in
the number of occurrences (very good!).  Traffic captured showed the mac
layer path clearly through cabinet K and out to the Oracle TNS server.  At
the network layer, you only had IP traffic between the client and the Oracle
TNS server.  You had 267 connection layer Symptoms between the Oracle TNS
server and the client.  263 of which were Ack Too Long.  All were because of
the Oracle TNS Server which had very poor Ack times.   The client's Ack's
averaged 2ms.  You had 1 Windows Frozen and 3 Retransmissions.  At the
session layer you had 2 Diagnoses and 4 Symptoms - all related to the Oracle
TNS Server."

Another test I did was to do a continuous ping from the server to one of the
connected sessions and saved the results to a file.  I saw some some minor
spikes over the period of 30 minutes or so but nothing significant.

The application is not the most efficient in the world (to put it mildly) as
it doesn't use bind variables (at all) and changing the app at this point is
not an option.  The app is in part written in C++ and in part in Delphi.
Every SQL statement is built as a concatenated string and then passed to the
database.  I know bind variable usage would help, particularly with all the
hard parsing but if I can't make app changes, then what?

Another thing the app does is that for every individual user that logs in,
multiple sessions will be spawned.  For a given single user who is logged in
when you look in v$session, you see as many as 10 sessions for that process.
So, in effect, 20 users "look like" 200.

I know, I know....is anybody ill yet?

Can anyone suggest anything that could help pinpoint all that SQL*Net wait
time?  Do you see anything in the trace data that would lead you to suggest
anything (init.ora parameter changes for example)?  I've tried everything I
can think of and am at wits end but still have a screaming client who's
highly upset with the excessive response times.

And one last thing which is kinda related....we changed the
optimizer_index_cost_adj and optimizer_index_caching parameters from their
defaults to 50 (both of them) and saw in our testing that it did make
significant improvements in some cases (the trace profile I've shown you
doesn't really have examples of this).  But...since the client didn't want
to bounce the database, we attempted to make these changes temporarily in a
logon trigger by using execute immediate to do alter session commands to set
the parameters per session logon.  It seemed to work wonderfully but one
application (one out of four) started hanging on startup right after we put
in the trigger.  When the trigger was dropped, the application was OK again.
What the heck is up with that?  I've never seen anything like that and can't
for the life of me figure out what the app could be doing that a logon
trigger would cause to freak out.  Any ideas on that one?

Many thanks in advance for your assistance,
Karen


PS  If you could send me direct replies to this post at
[EMAIL PROTECTED] as well as replying to the list I'd appreciate
it.

--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Karen Morton
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Jack van Zanen
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).



-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Karen Morton
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

RE: Excessive SQL*Net message from client waits

Reply via email to