I'm using Webware for a web application we're developing, and I believe we may have uncovered an obscure bug related to forwarding requests to other servlets. While a penetration test was being run against the application on our live server, we noticed that the app server would stop responding after a period of time and output the error message "OSError: [Errno 24] Too many open files". The web server itself still responded, so it is definitely something to do with Webware. We first theorized that it was caused by the volume of traffic sent to the site during the testing. However, the same web application is running on another server that is in production use and also gets very high traffic, and that server is not experiencing this problem.
We eventually figured out that it only happened when repeated errors were generated, as was the case during the penetration test when the site was being hammered with cross-site scripting attacks, SQL injection statements and the like. We were able to reproduce the bug on our alpha system by using a simple script to repeatedly access the same page in a loop while making sure we generated an exception during the servlet processing each time. We consistently saw the same error in the output in less than 100 iterations.

Upon further investigation, it became apparent that this only happens when accessing a page that ends up forwarding the request to another servlet, which then generates the offending exception. If we accessed a non-forwarding page using this script, it would run seemingly indefinitely (at least well over 500 iterations), even when generating an exception and triggering a 500 response each time.

I investigated the forwarding code in WebKit/Application.py and found the following code in the includeURL() method in lines 671-683:

    try:
        servlet.runTransaction(trans)
    except EndResponse:
        pass
    self.returnServlet(servlet, trans)

    # Restore everything properly
    req.popParent()
    req.setURLPath(currentPath)
    req._serverSidePath = currentServerSidePath
    req._serverSideContextPath = currentServerSideContextPath
    req._contextName = currentContextName
    trans._servlet = currentServlet

All the code past the try/except block does not get executed if an exception is generated during servlet processing. I tried catching the exception in a general except statement, then specifically executing that code before continuing to throw the exception back up the chain, and this seemed to eliminate the error. I tried commenting out the various lines and narrowed it down to a single line that seems to cause this error if it is not executed.
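The control-flow point can be seen in isolation with a minimal standalone sketch. This is not Webware code: EndResponse is redefined locally, and FakeTransaction, include() and servlet_after() are hypothetical stand-ins. It assumes the forwarding code points the transaction at the target servlet before running it.

```python
class EndResponse(Exception):
    """Local stand-in for Webware's EndResponse, so the sketch is self-contained."""


class FakeTransaction:
    """Hypothetical stand-in for a transaction; it only tracks the current servlet."""

    def __init__(self, servlet):
        self._servlet = servlet


def include(trans, target, run):
    """Mimic the shape of the forwarding code: swap the servlet, run, restore."""
    current = trans._servlet
    trans._servlet = target  # assumption: the forward swaps in the target servlet
    try:
        run()
    except EndResponse:
        pass
    # Reached only if run() returned normally or raised EndResponse.
    trans._servlet = current


def ok():
    pass


def end():
    raise EndResponse()


def boom():
    raise ValueError("error during servlet processing")


def servlet_after(run):
    """Return which servlet the transaction points at after the include."""
    trans = FakeTransaction("original")
    try:
        include(trans, "forwarded", run)
    except ValueError:
        pass  # the caller handles the error, as the 500 handler would
    return trans._servlet


print(servlet_after(ok))
print(servlet_after(end))
print(servlet_after(boom))
```

In this sketch only the third call leaves the transaction pointing at the forwarded servlet, because the restore line is never reached.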
The new code in includeURL() is as follows:

    try:
        servlet.runTransaction(trans)
    except EndResponse:
        pass
    ############################
    except:
        trans._servlet = currentServlet
        raise
    ############################
    self.returnServlet(servlet, trans)

    # Restore everything properly
    req.popParent()
    req.setURLPath(currentPath)
    req._serverSidePath = currentServerSidePath
    req._serverSideContextPath = currentServerSideContextPath
    req._contextName = currentContextName
    trans._servlet = currentServlet

After I did this, I hammered the forwarding page over 900 times with my script without seeing the "Too many open files" error. For some reason, not reassigning the original servlet back to the transaction after a forward causes something to be left open, which eventually generates this error.

Unfortunately, I don't have an answer as to why it causes this, but our best guess is that the original servlet gets stuck in "limbo" without being returned to the servlet cache. This would in turn cause new servlet objects to be generated for each new request. I guess the process is running out of available file descriptors before it runs out of memory?

In any case, adding these 3 lines to the Application class seems to fix the problem, but I'm concerned that this is simply a band-aid solution to a deeper-rooted problem related to the caching and returning of the servlet objects to the pool. The nearest I can figure is that it has something to do with when the die() method of the transaction object eventually gets called, which dereferences all the attributes associated with the object.

If anyone has more experience with the lower-level functions of Webware, perhaps they can shed some light on this issue. If this solution is indeed a sound one, then we'll simply continue to use it (perhaps it can make its way into a future version?), but I want to be sure we're not missing something important. If this is a symptom of a deeper problem, it could potentially show up elsewhere.
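One alternative shape for the same idea is a try/finally block, which runs the restore step whether the servlet returns normally, raises EndResponse, or raises anything else. The sketch below is standalone and not a patch to Webware: EndResponse is redefined locally, and FakeTransaction and include_with_finally() are hypothetical names. Whether it is safe in Webware itself to run every restore step (including returnServlet) on the error path has not been verified here.

```python
class EndResponse(Exception):
    """Local stand-in for Webware's EndResponse, so the sketch is self-contained."""


class FakeTransaction:
    """Hypothetical stand-in for a transaction; it only tracks the current servlet."""

    def __init__(self, servlet):
        self._servlet = servlet


def include_with_finally(trans, target, run):
    """Swap in the target servlet, run it, and always restore the original."""
    current = trans._servlet
    trans._servlet = target  # assumption: the forward swaps in the target servlet
    try:
        run()
    except EndResponse:
        pass  # a normal early end of the response, not an error
    finally:
        # Runs on success, on EndResponse, and while any other exception propagates.
        trans._servlet = current


def boom():
    raise ValueError("error during servlet processing")


trans = FakeTransaction("original")
propagated = False
try:
    include_with_finally(trans, "forwarded", boom)
except ValueError:
    propagated = True  # the error still reaches the caller

print(propagated, trans._servlet)
```

The exception still propagates to the caller, as with the bare except/raise version, but the restore logic is written once instead of being duplicated on the error path.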
Thanks for your help!