> > To make matters more interesting, I'm running the code in an OpenVZ
> > virtualized container. I can consistently reproduce the issue using
> > OpenVZ to a MySQL database in a bunch of configurations -- different
> > http server (passenger, thin, etc.), different OSes (Ubuntu, Debian,
> > CentOS), and different adapters (MySQL, MySQL2). I have been able to
> > eliminate the database as the source of the issue. And even beyond
> > that, when I run the same code using the Sequel ORM on a bare metal
> > system, I do not see the mutex hang issue. It only occurs under
> > virtualization.
>
> If it is consistently reproducible under virtualization and not
> reproducible outside of virtualization, my jump to conclusions mat
> leads me to believe it may be due to virtualization-related bugs. :)
> It's possible Sequel triggers these bugs where ActiveRecord does not.
> Of course, it's also possible that there are race conditions in Sequel
> that are very unlikely to be hit on real hardware but more likely to
> be hit in virtualization scenarios.

Hi Jeremy --

Here's what I learned after I sent the previous message:

1. The hang and crash are a TCP/IP issue related to how connections
   between the connection pool and the database are managed. Through
   tcpdump I can see the bytes written to the socket and then the socket
   timing out while waiting for the response from the database. I can
   also see the crash in strace traces attached to the running processes.

2. The problem happens consistently and repeatedly under virtualization
   in a host of situations.

3. I have set the :single_threaded=>true flag in the Sequel.connect()
   call and it has no noticeable effect on the issue.

4. The issue appears when using the ORM portion of Sequel. Once the
   Sequel::Model object is called anywhere in the code path, /even if
   the object is never invoked after creation/, it causes the crash.

5. However, I do not see the crash if the Sequel::Model object is never
   called. If the code only uses direct SQL with the Sequel DSL on the
   connection object, it works reliably. I was able to pour ~200K
   transactions through some simple queries.

I spent several hours with my DBA this afternoon trying to model how the
Sequel connection pooling was communicating with the database. What we
theorize from the connection counts on the database side is that Sequel
is consuming a dead connection out of the pool and hanging. I have tried
resetting several of the pool and timeout settings, with no noticeable
effect.

I do know OpenVZ has an open issue with connections hanging in a
CLOSE_WAIT state. This may be negatively affecting Sequel when using the
ORM. It may mean that Sequel is not as puritanical as it should be about
the state of connections in the pool before releasing and reconnecting,
or that it assumes the underlying operating system is acting in good
faith when dealing with connection lifecycles. What worries me deeply is
that this will also affect the working portions of Sequel when
connections age and time out while the service is idle.

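To make the dead-connection theory concrete, here is a minimal sketch of what "checking a connection's state before handing it out" could look like. Everything in it is hypothetical: FakeConnection and ValidatingPool are names made up for illustration, and this is not Sequel's pool code, nor a claim about how Sequel behaves. It is single-threaded and only shows the idea of validating on checkout.

```ruby
# Hypothetical illustration only: NOT Sequel's connection pool.
# FakeConnection stands in for a real database socket.
class FakeConnection
  attr_reader :id

  def initialize(id)
    @id = id
    @alive = true
  end

  # Simulates the remote end going away (for example, a socket left
  # hanging in CLOSE_WAIT).
  def kill!
    @alive = false
  end

  # A real adapter would do a cheap round trip here (assumption: something
  # like "SELECT 1"); the fake just reports its flag.
  def alive?
    @alive
  end
end

# Single-threaded pool that validates each connection on checkout.
class ValidatingPool
  attr_reader :discarded

  def initialize(size)
    @next_id = 0
    @discarded = 0
    @idle = Array.new(size) { new_connection }
  end

  # Yields a connection that passed the liveness check, then returns it
  # to the idle list. The block's value is returned to the caller.
  def hold
    conn = checkout
    begin
      yield conn
    ensure
      @idle.push(conn)
    end
  end

  private

  def new_connection
    @next_id += 1
    FakeConnection.new(@next_id)
  end

  # Discards idle connections that fail validation and opens a fresh
  # connection if no live one is left.
  def checkout
    while (conn = @idle.shift)
      return conn if conn.alive?
      @discarded += 1
    end
    new_connection
  end
end
```

With a pool like this, a connection that died while idle is dropped at checkout instead of being handed to the caller, which would then hang waiting for a reply that never comes. The cost is one extra round trip per checkout, which is why real pools usually make this kind of check optional or time-based.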
My question is: with regard to the connection pool, is there a specific
code path that Sequel::Model uses and that the object returned by
Sequel.connect does not, which would cause one form of the library to
recycle connections properly and the other not?

Thank you for your assistance!

-- 
Emily K. Dresner-Thornber
