Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-14 Thread Rudá Porto Filgueiras
On Thu, Aug 13, 2009 at 5:45 PM, Ross J. Reedstromreeds...@rice.edu wrote:
 On Wed, Aug 12, 2009 at 08:57:39PM -0300, Rudá Porto Filgueiras wrote:
 On Wed, Aug 12, 2009 at 8:20 PM, Shane Hathawaysh...@hathawaymix.org wrote:
  A concurrent pack might provide another explanation.  Have you ever packed
  the database?

 No, there is no pack operation running, only people using Plone to put 
 content.

 Hmm, it's possible for some things in Plone to keep transactions open a
 very long time: refreshing the portal_catalog for example.

Yes, but there is no portal_catalog update too, only normal catalog operations.

 I first try to use PostgreSQL, but the time to convert ZODB
 FileStorage was much larger than MySQL, then I decide to use MySQL.

 snip

 I'm interested in knowing why you're moving from ZODB to RelStorage,
 rather than ZEO? How large is your Data.fs, if you don't mind sharing
 that information?

Our Data.fs is not so large, it's about 5GB but it will almost reach
10GB in six months or before.
One important reason to use RelStorage is backend replication
possibilities (we are not using it yet) and to give better performance
when many people put content in Plone at same time.

ZODB is not optmized for large write operations and ZEO will scale
poor than RelStraoge in multiple CPU servers, and I know that IO is
normaly the problem but we have 15K HBA storage.

I really wish some bechmarks scripts to run in our deploy to estimate
more acurately how better is RelStorage than ZEO. Shane benchmark
results was the main information that compare ZEO and RelStorage and
motivate the swicth, and the number of Zope instances that will grow
(new server) are another reason.

 Ross
 --
 Ross Reedstrom, Ph.D.                                 reeds...@rice.edu
 Systems Engineer  Admin, Research Scientist        phone: 713-348-6166
 The Connexions Project      http://cnx.org            fax: 713-348-3665
 Rice University MS-375, Houston, TX 77005
 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE

-- 
Rudá Porto Filgueiras
http://python-blog.blogspot.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Jürgen Herrmann

On Wed, August 12, 2009 22:23, Rudá Porto Filgueiras wrote:
 I begin to use RelStorage in a production site with Plone 2.5.
 Everything was running without failures since 01 august 2009.
 But today after a failure in tpc_abort, all instances conneceted to
 MySQL can't acquire commit lock.

 Follow tpc_abort traceback:

 2009-08-12T14:12:08 ERROR txn.1115806016 Error in tpc_abort() on
 manager MultiObjectResourceAdapter for ZODB.DB.TransactionalUndo
 object at 0x210f6790 at 654293328
 Traceback (most recent call last):
   File
 /usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py,
 line 533, in _cleanup
 rm.tpc_abort(self)
   File
 /usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py,
 line 628, in tpc_abort
 self.manager.tpc_abort(txn)
   File
 /usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/ZODB/BaseStorage.py,
 line 194, in tpc_abort
 self._abort()
   File
 /usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py,
 line 710, in _abort
 self._rollback_load_connection()
   File
 /usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py,
 line 166, in _rollback_load_connection
 self._load_conn.rollback()
 OperationalError: (2006, 'MySQL server has gone away')

 And after, all instances report this exeption:

 2009-08-12T14:21:53 ERROR Zope.SiteErrorLog
 http://adm.agecom.ba.gov.br/login_form
 Traceback (innermost last):
   Module ZPublisher.Publish, line 121, in publish
   Module Zope2.App.startup, line 240, in commit
   Module transaction._manager, line 96, in commit
   Module transaction._transaction, line 395, in commit
   Module transaction._transaction, line 498, in _commitResources
   Module ZODB.Connection, line 730, in tpc_vote
   Module relstorage.relstorage, line 675, in tpc_vote
   Module relstorage.relstorage, line 659, in _vote
   Module relstorage.relstorage, line 566, in _prepare_tid
   Module relstorage.adapters.mysql, line 506, in start_commit
   Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
 StorageError: Unable to acquire commit lock

 I solve the problem restarting all instances, and the site became
 operational again, but I have some questions:

 This can be a bug or there is any problem in my enviroment/application?
 There is another solution to release commit lock without restart all 
 instances?

 Cheers,

 --
 Rudá Porto Filgueiras
 http://python-blog.blogspot.com
 ___
 For more information about ZODB, see the ZODB Wiki:
 http://www.zope.org/Wikis/ZODB/

 ZODB-Dev mailing list  -  ZODB-Dev@zope.org
 http://mail.zope.org/mailman/listinfo/zodb-dev


i just came back from reading relstorage code (research work for
radosstorage) and the lock is actually held on the mysql server.
my guess is that the connection drop you experience earlier left
the lock in place on the mysql-server. obviously the mysql-server
did not notice you connection dying, otherwise it would have
released the lock,see
http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_get-lock
probably restarting mysql would have solved your issue?!

regards, jürgen
--
 XLhost.de - eXperts in Linux hosting ® 

XLhost.de GmbH
Jürgen Herrmann, Geschäftsführer
Boelckestrasse 21, 93051 Regensburg, Germany

Geschäftsführer: Volker Geith, Jürgen Herrmann
Registriert unter: HRB9918
Umsatzsteuer-Identifikationsnummer: DE245931218

Fon:  +49 (0)800 XLHOSTDE [0800 95467833]
Fax:  +49 (0)800 95467830

WEB:  http://www.XLhost.de
IRC:  #xlh...@irc.quakenet.org

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Rudá Porto Filgueiras
On Wed, Aug 12, 2009 at 5:35 PM, Jürgen
Herrmannjuergen.herrm...@xlhost.de wrote:

 On Wed, August 12, 2009 22:23, Rudá Porto Filgueiras wrote:
 I begin to use RelStorage in a production site with Plone 2.5.
 Everything was running without failures since 01 august 2009.
 But today after a failure in tpc_abort, all instances conneceted to
 MySQL can't acquire commit lock.

 Follow tpc_abort traceback:

 2009-08-12T14:12:08 ERROR txn.1115806016 Error in tpc_abort() on
 manager MultiObjectResourceAdapter for ZODB.DB.TransactionalUndo
 object at 0x210f6790 at 654293328
 Traceback (most recent call last):
   File
 /usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py,
 line 533, in _cleanup
     rm.tpc_abort(self)
   File
 /usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py,
 line 628, in tpc_abort
     self.manager.tpc_abort(txn)
   File
 /usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/ZODB/BaseStorage.py,
 line 194, in tpc_abort
     self._abort()
   File
 /usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py,
 line 710, in _abort
     self._rollback_load_connection()
   File
 /usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py,
 line 166, in _rollback_load_connection
     self._load_conn.rollback()
 OperationalError: (2006, 'MySQL server has gone away')

 And after, all instances report this exeption:

 2009-08-12T14:21:53 ERROR Zope.SiteErrorLog
 http://adm.agecom.ba.gov.br/login_form
 Traceback (innermost last):
   Module ZPublisher.Publish, line 121, in publish
   Module Zope2.App.startup, line 240, in commit
   Module transaction._manager, line 96, in commit
   Module transaction._transaction, line 395, in commit
   Module transaction._transaction, line 498, in _commitResources
   Module ZODB.Connection, line 730, in tpc_vote
   Module relstorage.relstorage, line 675, in tpc_vote
   Module relstorage.relstorage, line 659, in _vote
   Module relstorage.relstorage, line 566, in _prepare_tid
   Module relstorage.adapters.mysql, line 506, in start_commit
   Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
 StorageError: Unable to acquire commit lock

 I solve the problem restarting all instances, and the site became
 operational again, but I have some questions:

 This can be a bug or there is any problem in my enviroment/application?
 There is another solution to release commit lock without restart all 
 instances?

 Cheers,

 --
 Rudá Porto Filgueiras
 http://python-blog.blogspot.com
 ___
 For more information about ZODB, see the ZODB Wiki:
 http://www.zope.org/Wikis/ZODB/

 ZODB-Dev mailing list  -  zodb-...@zope.org
 http://mail.zope.org/mailman/listinfo/zodb-dev


 i just came back from reading relstorage code (research work for
 radosstorage) and the lock is actually held on the mysql server.
 my guess is that the connection drop you experience earlier left
 the lock in place on the mysql-server. obviously the mysql-server
 did not notice you connection dying, otherwise it would have
 released the lock,see
 http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_get-lock

But the MySQL is hosted in the same machine, it's not a network failure. :-(
I supose this query can solve the problem too:

SELECT RELEASE_LOCK(CONCAT(DATABASE(), '.commit'))

 probably restarting mysql would have solved your issue?!

I think so, but I can't test it. :-(

But the question remain, why the database connection was not safely
closed when tcp_abort fail?
I guess MySQL adapter.close raise a close_exception and it's not logged.

But, it's not the first time this issue happen, I saw the same problem
but it's not easy to reproduce.
How to detect that a connection die in the midle of transaction and
has left the commit lock *LOCKED*?

If it can be detect and FLAGED, adapter.close or load_connection
should take care to execute MySQL RELEASE_LOCK?

 regards, jürgen
 --
 XLhost.de - eXperts in Linux hosting ® 

 XLhost.de GmbH
 Jürgen Herrmann, Geschäftsführer
 Boelckestrasse 21, 93051 Regensburg, Germany

 Geschäftsführer: Volker Geith, Jürgen Herrmann
 Registriert unter: HRB9918
 Umsatzsteuer-Identifikationsnummer: DE245931218

 Fon:  +49 (0)800 XLHOSTDE [0800 95467833]
 Fax:  +49 (0)800 95467830

 WEB:  http://www.XLhost.de
 IRC:  #xlh...@irc.quakenet.org





-- 
Rudá Porto Filgueiras
http://python-blog.blogspot.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Shane Hathaway
Rudá Porto Filgueiras wrote:
 (...)
   Module relstorage.adapters.mysql, line 506, in start_commit
   Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
 StorageError: Unable to acquire commit lock
 
 I solve the problem restarting all instances, and the site became
 operational again, but I have some questions:
 
 This can be a bug or there is any problem in my enviroment/application?
 There is another solution to release commit lock without restart all 
 instances?

Perhaps some instance was taking a very long time to finish a 
transaction commit and you didn't notice it.  RelStorage does everything 
it can to minimize the amount of time the commit lock is held (it uses a 
strategy similar to ZEO), but applications are ultimately in control of 
how long it takes to commit a transaction.

A concurrent pack might provide another explanation.  Have you ever 
packed the database?

This could also indicate a bug in MySQL.  According to the documentation 
of get_lock(), all locks will be released when connections terminate, 
but maybe you ran into a MySQL bug that causes locks to stick around.

To me, the most plausible explanation is a MySQL bug, since the other 
hypotheses don't explain why one of the connections terminated prematurely.

If I were you, I would try the same application with PostgreSQL instead 
of MySQL.  If the bug persists, then at least we know it's not a MySQL 
bug. :-)

 But the question remain, why the database connection was not safely
 closed when tcp_abort fail?

The error message occurring in tpc_abort was OperationalError: (2006, 
'MySQL server has gone away'), suggesting that the database connection 
was *already closed*.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Rudá Porto Filgueiras
On Wed, Aug 12, 2009 at 8:20 PM, Shane Hathawaysh...@hathawaymix.org wrote:
 Rudá Porto Filgueiras wrote:

 (...)
  Module relstorage.adapters.mysql, line 506, in start_commit
  Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
 StorageError: Unable to acquire commit lock

 I solve the problem restarting all instances, and the site became
 operational again, but I have some questions:

 This can be a bug or there is any problem in my enviroment/application?
 There is another solution to release commit lock without restart all
 instances?

 Perhaps some instance was taking a very long time to finish a transaction
 commit and you didn't notice it.  RelStorage does everything it can to
 minimize the amount of time the commit lock is held (it uses a strategy
 similar to ZEO), but applications are ultimately in control of how long it
 takes to commit a transaction.

 A concurrent pack might provide another explanation.  Have you ever packed
 the database?

No, there is no pack operation running, only people using Plone to put content.

 This could also indicate a bug in MySQL.  According to the documentation of
 get_lock(), all locks will be released when connections terminate, but maybe
 you ran into a MySQL bug that causes locks to stick around.

Yeah, it sticks for ever. :-(
Maybe some tunning on inactive connection time in MySQL should help?

 To me, the most plausible explanation is a MySQL bug, since the other
 hypotheses don't explain why one of the connections terminated prematurely.

I suspect it's MySQL related, becouse it happen with Plone 2.5 and
Plone 3.0, RelStorage 1.1.3 and 1.2b2.

 If I were you, I would try the same application with PostgreSQL instead of
 MySQL.  If the bug persists, then at least we know it's not a MySQL bug. :-)

I first try to use PostgreSQL, but the time to convert ZODB
FileStorage was much larger than MySQL, then I decide to use MySQL.

 But the question remain, why the database connection was not safely
 closed when tcp_abort fail?

 The error message occurring in tpc_abort was OperationalError: (2006,
 'MySQL server has gone away'), suggesting that the database connection was
 *already closed*.

Yes, it was closed in the midle of transaction, but if you are sure
there is nothing to do with RelStorage and there is no possible or
workaround to the RelStoratge MySQL adapter to lead with MySQL bugs or
resistent do to network failures and recover from this deadlock
situation (or it will an ugly hack), I will move to PostgreSQL.

I also began with MySQL 5.0 from CentOS 5.0 distribution and after a
change to MySQL 5.4.1 and this failure occur on both.

Cheers,

 Shane

-- 
Rudá Porto Filgueiras
http://python-blog.blogspot.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Shane Hathaway
Rudá Porto Filgueiras wrote:
 Yes, it was closed in the midle of transaction, but if you are sure
 there is nothing to do with RelStorage and there is no possible or
 workaround to the RelStoratge MySQL adapter to lead with MySQL bugs or
 resistent do to network failures and recover from this deadlock
 situation (or it will an ugly hack), I will move to PostgreSQL.

At this point I can't even think of an ugly hack that would solve this, 
since I can't understand the state MySQL is getting into.  I assume that 
calling RELEASE_LOCK in MySQL has no effect.

 I also began with MySQL 5.0 from CentOS 5.0 distribution and after a
 change to MySQL 5.4.1 and this failure occur on both.

Did you mean CentOS 5.2?  If not, I would upgrade. :-)

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev