As some of you know, I'm pulling my hair out trying to track down why we have both driver crashes and complete stalls of the SAPDB software. On our production web sites, about every 24 hours we run into a stall of the database server or a crash of the driver. Stalls can last as long as 5 minutes where SAPDB doesn't respond. We believe we have eliminated all hardware / network issues (yes, even DNS).
Our site has a lot of concurrency, so I've been trying to devise "stress tests" that run many small transactions over and over, as our web site does. The problem seems to crop up only after accumulated usage, which points to small leaks or similar problems.

A week or so ago I posted a command-line .NET stress test program that could pretty quickly generate errors out of the ODBC driver. The most common were -709 errors.

Now I have a new symptom of the problem. While running my web site testing today, I was watching DBMCLI and found that I once got hit with an error from DBMCLI itself! So I decided to stress test DBMCLI. Following is a batch file that runs DBMCLI in a loop and stops once it hits an error.

==== BEGIN Win32 BATCH FILE =====
@ECHO OFF
REM ***
REM *** Change the following line to be unique for each run
REM ***
SET O=lasterr1.txt
SET Udbm=dbm,dbm
SET DB1=TST
SET A=

:Top
REM *** A grows by one "!" per pass; every 18th pass goes to ShowOne
SET A=%A%!
IF %A%==!!!!!!!!!!!!!!!!!! GOTO ShowOne
dbmcli -n localhost -d %DB1% -u %Udbm% -uSQL -c info state > %O%
IF ERRORLEVEL 1 GOTO ERROR1
GOTO Top

:ShowOne
ERASE %O%
dbmcli -n localhost -d %DB1% -u %Udbm% -uSQL -c info state
IF ERRORLEVEL 1 GOTO ERROR1
rem *** ping a host you can't find to simulate a sleep command.
ping -w 1000 -n 1 192.187.188.2
SET A=
GOTO Top

:ERROR1
ECHO ***
ECHO *** Error encountered!
ECHO ***
IF EXIST %O% TYPE %O%
==== END Win32 BATCH FILE =====

Works great, runs for hours without problem. The problem starts when you open a second CMD.EXE prompt and run a second instance of the batch file at the same time.

IMPORTANT: To run more than one test at the same time, you need to make a second copy of the batch file and revise the O parameter on line 5. The O parameter needs to be unique for each (concurrent) instance. Example: TEST1.BAT has O=test1.txt and TEST2.BAT has O=test2.txt on line 5.

After only a few minutes, I start getting errors like:

1. Error! Connection failed to node localhost for database TST:
   ERR_USRFAIL: user authorization failed

2. -24988,ERR_SQL: sql error
   -4008,Unknown user name/password combination

3. Error! Connection failed to node localhost for database TST:
   could not connect to socket [10048]

4. The syntax of the command is incorrect.

Or just run ONE instance at the same time you have a looping ODBC application running, and you start getting these random errors from the ODBC side:

ERROR [08001] [SAP AG][SQLOD32 DLL][SAP DB]Unable to connect to data source;-709 CONNECT: (could not connect to socket [10048])

Correct me if I'm wrong, but DBMCLI isn't using ODBC, so is the problem deeper within SAPDB than the ODBC driver?

I notice DBMCLI produces different errors depending on how long things have been running. If you reboot and start fresh, it takes a while for errors to appear -- and the pattern of errors seems to change after the tests have been running for some time. Leaks?

Please, if you have time, try to reproduce and track down these problems. Does anyone have time to test Linux for the same problem?

Thank you.

Stephen Gutknecht
Renton, Washington USA

_______________________________________________
sapdb.general mailing list
[EMAIL PROTECTED]
http://listserv.sap.com/mailman/listinfo/sapdb.general
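P.S. For anyone who would rather not keep two copies of the batch file, or who wants to try this on Linux, here is a rough Python sketch of the same idea: several concurrent loops, each stopping at the first non-zero exit code. It is not what I ran for the results above. The names `run_until_error`, `stress` and `DBMCLI_CMD` are made up for illustration, and the dbmcli arguments are simply copied from the batch file, so adjust the host, database and DBM user for your installation.

```python
import subprocess
import threading

# Assumption: same dbmcli invocation as in the batch file; change the
# host, database name (TST) and DBM user (dbm,dbm) to match your system.
DBMCLI_CMD = ["dbmcli", "-n", "localhost", "-d", "TST", "-u", "dbm,dbm",
              "-uSQL", "-c", "info", "state"]


def run_until_error(cmd, max_runs, stop, failures, worker_id):
    """Run cmd up to max_runs times; record the first failure and stop everyone."""
    for run in range(1, max_runs + 1):
        if stop.is_set():
            return
        # Output is captured per run, so workers never share an output file.
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((worker_id, run, result.returncode,
                             result.stdout + result.stderr))
            stop.set()
            return


def stress(cmd, workers=2, max_runs=1000):
    """Start `workers` concurrent loops and return the recorded failures."""
    stop = threading.Event()
    failures = []
    threads = [threading.Thread(target=run_until_error,
                                args=(cmd, max_runs, stop, failures, i))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failures
```

Calling `stress(DBMCLI_CMD, workers=2)` and printing each returned tuple (worker number, run number, exit code, captured output) should give roughly the same information as the `:ERROR1` section of the batch file.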
