Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-14 Thread Rudá Porto Filgueiras
On Thu, Aug 13, 2009 at 5:45 PM, Ross J. Reedstrom wrote:
> On Wed, Aug 12, 2009 at 08:57:39PM -0300, Rudá Porto Filgueiras wrote:
>> On Wed, Aug 12, 2009 at 8:20 PM, Shane Hathaway wrote:
>> > A concurrent pack might provide another explanation.  Have you ever packed
>> > the database?
>>
>> No, there is no pack operation running, only people using Plone to put 
>> content.
>
> Hmm, it's possible for some things in Plone to keep transactions open a
> very long time: refreshing the portal_catalog for example.

Yes, but there is no portal_catalog update too, only normal catalog operations.

>> I first try to use PostgreSQL, but the time to convert ZODB
>> FileStorage was much larger than MySQL, then I decide to use MySQL.
>
> 
>
> I'm interested in knowing why you're moving from ZODB to RelStorage,
> rather than ZEO? How large is your Data.fs, if you don't mind sharing
> that information?

Our Data.fs is not so large, it's about 5GB but it will almost reach
10GB in six months or before.
One important reason to use RelStorage is backend replication
possibilities (we are not using it yet) and to give better performance
when many people put content in Plone at same time.

ZODB is not optmized for large write operations and ZEO will scale
poor than RelStraoge in multiple CPU servers, and I know that IO is
normaly the problem but we have 15K HBA storage.

I really wish some bechmarks scripts to run in our deploy to estimate
more acurately how better is RelStorage than ZEO. Shane benchmark
results was the main information that compare ZEO and RelStorage and
motivate the swicth, and the number of Zope instances that will grow
(new server) are another reason.

> Ross
> --
> Ross Reedstrom, Ph.D.                                 reeds...@rice.edu
> Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
> The Connexions Project      http://cnx.org            fax: 713-348-3665
> Rice University MS-375, Houston, TX 77005
> GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE

-- 
Rudá Porto Filgueiras
http://python-blog.blogspot.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-13 Thread Ross J. Reedstrom
On Wed, Aug 12, 2009 at 08:57:39PM -0300, Rudá Porto Filgueiras wrote:
> On Wed, Aug 12, 2009 at 8:20 PM, Shane Hathaway wrote:
> > A concurrent pack might provide another explanation.  Have you ever packed
> > the database?
> 
> No, there is no pack operation running, only people using Plone to put 
> content.

Hmm, it's possible for some things in Plone to keep transactions open a
very long time: refreshing the portal_catalog for example.

> I first try to use PostgreSQL, but the time to convert ZODB
> FileStorage was much larger than MySQL, then I decide to use MySQL.
 


I'm interested in knowing why you're moving from ZODB to RelStorage,
rather than ZEO? How large is your Data.fs, if you don't mind sharing
that information?

Ross
-- 
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer & Admin, Research Scientistphone: 713-348-6166
The Connexions Project  http://cnx.orgfax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Shane Hathaway
Rudá Porto Filgueiras wrote:
> Yes, it was closed in the midle of transaction, but if you are sure
> there is nothing to do with RelStorage and there is no possible or
> workaround to the RelStoratge MySQL adapter to lead with MySQL bugs or
> resistent do to network failures and recover from this "deadlock"
> situation (or it will an ugly hack), I will move to PostgreSQL.

At this point I can't even think of an ugly hack that would solve this, 
since I can't understand the state MySQL is getting into.  I assume that 
calling RELEASE_LOCK in MySQL has no effect.

> I also began with MySQL 5.0 from CentOS 5.0 distribution and after a
> change to MySQL 5.4.1 and this failure occur on both.

Did you mean CentOS 5.2?  If not, I would upgrade. :-)

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Rudá Porto Filgueiras
On Wed, Aug 12, 2009 at 8:20 PM, Shane Hathaway wrote:
> Rudá Porto Filgueiras wrote:
>>
>> (...)
>>  Module relstorage.adapters.mysql, line 506, in start_commit
>>  Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
>> StorageError: Unable to acquire commit lock
>>
>> I solve the problem restarting all instances, and the site became
>> operational again, but I have some questions:
>>
>> This can be a bug or there is any problem in my enviroment/application?
>> There is another solution to release commit lock without restart all
>> instances?
>
> Perhaps some instance was taking a very long time to finish a transaction
> commit and you didn't notice it.  RelStorage does everything it can to
> minimize the amount of time the commit lock is held (it uses a strategy
> similar to ZEO), but applications are ultimately in control of how long it
> takes to commit a transaction.
>
> A concurrent pack might provide another explanation.  Have you ever packed
> the database?

No, there is no pack operation running, only people using Plone to put content.

> This could also indicate a bug in MySQL.  According to the documentation of
> get_lock(), all locks will be released when connections terminate, but maybe
> you ran into a MySQL bug that causes locks to stick around.

Yeah, it sticks for ever. :-(
Maybe some tunning on inactive connection time in MySQL should help?

> To me, the most plausible explanation is a MySQL bug, since the other
> hypotheses don't explain why one of the connections terminated prematurely.

I suspect it's MySQL related, becouse it happen with Plone 2.5 and
Plone 3.0, RelStorage 1.1.3 and 1.2b2.

> If I were you, I would try the same application with PostgreSQL instead of
> MySQL.  If the bug persists, then at least we know it's not a MySQL bug. :-)

I first try to use PostgreSQL, but the time to convert ZODB
FileStorage was much larger than MySQL, then I decide to use MySQL.

>> But the question remain, why the database connection was not safely
>> closed when tcp_abort fail?
>
> The error message occurring in tpc_abort was "OperationalError: (2006,
> 'MySQL server has gone away')", suggesting that the database connection was
> *already closed*.

Yes, it was closed in the midle of transaction, but if you are sure
there is nothing to do with RelStorage and there is no possible or
workaround to the RelStoratge MySQL adapter to lead with MySQL bugs or
resistent do to network failures and recover from this "deadlock"
situation (or it will an ugly hack), I will move to PostgreSQL.

I also began with MySQL 5.0 from CentOS 5.0 distribution and after a
change to MySQL 5.4.1 and this failure occur on both.

Cheers,

> Shane

-- 
Rudá Porto Filgueiras
http://python-blog.blogspot.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Shane Hathaway
Rudá Porto Filgueiras wrote:
> (...)
>   Module relstorage.adapters.mysql, line 506, in start_commit
>   Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
> StorageError: Unable to acquire commit lock
> 
> I solve the problem restarting all instances, and the site became
> operational again, but I have some questions:
> 
> This can be a bug or there is any problem in my enviroment/application?
> There is another solution to release commit lock without restart all 
> instances?

Perhaps some instance was taking a very long time to finish a 
transaction commit and you didn't notice it.  RelStorage does everything 
it can to minimize the amount of time the commit lock is held (it uses a 
strategy similar to ZEO), but applications are ultimately in control of 
how long it takes to commit a transaction.

A concurrent pack might provide another explanation.  Have you ever 
packed the database?

This could also indicate a bug in MySQL.  According to the documentation 
of get_lock(), all locks will be released when connections terminate, 
but maybe you ran into a MySQL bug that causes locks to stick around.

To me, the most plausible explanation is a MySQL bug, since the other 
hypotheses don't explain why one of the connections terminated prematurely.

If I were you, I would try the same application with PostgreSQL instead 
of MySQL.  If the bug persists, then at least we know it's not a MySQL 
bug. :-)

> But the question remain, why the database connection was not safely
> closed when tcp_abort fail?

The error message occurring in tpc_abort was "OperationalError: (2006, 
'MySQL server has gone away')", suggesting that the database connection 
was *already closed*.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Rudá Porto Filgueiras
On Wed, Aug 12, 2009 at 5:35 PM, Jürgen
Herrmann wrote:
>
> On Wed, August 12, 2009 22:23, Rudá Porto Filgueiras wrote:
>> I begin to use RelStorage in a production site with Plone 2.5.
>> Everything was running without failures since 01 august 2009.
>> But today after a failure in tpc_abort, all instances conneceted to
>> MySQL can't acquire commit lock.
>>
>> Follow tpc_abort traceback:
>>
>> 2009-08-12T14:12:08 ERROR txn.1115806016 Error in tpc_abort() on
>> manager > object at 0x210f6790> at 654293328>
>> Traceback (most recent call last):
>>   File
>> "/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py",
>> line 533, in _cleanup
>>     rm.tpc_abort(self)
>>   File
>> "/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py",
>> line 628, in tpc_abort
>>     self.manager.tpc_abort(txn)
>>   File
>> "/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/ZODB/BaseStorage.py",
>> line 194, in tpc_abort
>>     self._abort()
>>   File
>> "/usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py",
>> line 710, in _abort
>>     self._rollback_load_connection()
>>   File
>> "/usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py",
>> line 166, in _rollback_load_connection
>>     self._load_conn.rollback()
>> OperationalError: (2006, 'MySQL server has gone away')
>>
>> And after, all instances report this exeption:
>>
>> 2009-08-12T14:21:53 ERROR Zope.SiteErrorLog
>> http://adm.agecom.ba.gov.br/login_form
>> Traceback (innermost last):
>>   Module ZPublisher.Publish, line 121, in publish
>>   Module Zope2.App.startup, line 240, in commit
>>   Module transaction._manager, line 96, in commit
>>   Module transaction._transaction, line 395, in commit
>>   Module transaction._transaction, line 498, in _commitResources
>>   Module ZODB.Connection, line 730, in tpc_vote
>>   Module relstorage.relstorage, line 675, in tpc_vote
>>   Module relstorage.relstorage, line 659, in _vote
>>   Module relstorage.relstorage, line 566, in _prepare_tid
>>   Module relstorage.adapters.mysql, line 506, in start_commit
>>   Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
>> StorageError: Unable to acquire commit lock
>>
>> I solve the problem restarting all instances, and the site became
>> operational again, but I have some questions:
>>
>> This can be a bug or there is any problem in my enviroment/application?
>> There is another solution to release commit lock without restart all 
>> instances?
>>
>> Cheers,
>>
>> --
>> Rudá Porto Filgueiras
>> http://python-blog.blogspot.com
>> ___
>> For more information about ZODB, see the ZODB Wiki:
>> http://www.zope.org/Wikis/ZODB/
>>
>> ZODB-Dev mailing list  -  zodb-...@zope.org
>> http://mail.zope.org/mailman/listinfo/zodb-dev
>>
>
> i just came back from reading relstorage code (research work for
> radosstorage) and the lock is actually held on the mysql server.
> my guess is that the connection drop you experience earlier left
> the lock in place on the mysql-server. obviously the mysql-server
> did not notice you connection dying, otherwise it would have
> released the lock,see
> http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_get-lock

But the MySQL is hosted in the same machine, it's not a network failure. :-(
I supose this query can solve the problem too:

SELECT RELEASE_LOCK(CONCAT(DATABASE(), '.commit'))

> probably restarting mysql would have solved your issue?!

I think so, but I can't test it. :-(

But the question remain, why the database connection was not safely
closed when tcp_abort fail?
I guess MySQL adapter.close raise a close_exception and it's not logged.

But, it's not the first time this issue happen, I saw the same problem
but it's not easy to reproduce.
How to detect that a connection die in the midle of transaction and
has left the commit lock *LOCKED*?

If it can be detect and FLAGED, adapter.close or load_connection
should take care to execute MySQL RELEASE_LOCK?

> regards, jürgen
> --
>>> XLhost.de - eXperts in Linux hosting ® <<
>
> XLhost.de GmbH
> Jürgen Herrmann, Geschäftsführer
> Boelckestrasse 21, 93051 Regensburg, Germany
>
> Geschäftsführer: Volker Geith, Jürgen Herrmann
> Registriert unter: HRB9918
> Umsatzsteuer-Identifikationsnummer: DE245931218
>
> Fon:  +49 (0)800 XLHOSTDE [0800 95467833]
> Fax:  +49 (0)800 95467830
>
> WEB:  http://www.XLhost.de
> IRC:  #xlh...@irc.quakenet.org
>
>



-- 
Rudá Porto Filgueiras
http://python-blog.blogspot.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Jürgen Herrmann

On Wed, August 12, 2009 22:23, Rudá Porto Filgueiras wrote:
> I begin to use RelStorage in a production site with Plone 2.5.
> Everything was running without failures since 01 august 2009.
> But today after a failure in tpc_abort, all instances conneceted to
> MySQL can't acquire commit lock.
>
> Follow tpc_abort traceback:
>
> 2009-08-12T14:12:08 ERROR txn.1115806016 Error in tpc_abort() on
> manager  object at 0x210f6790> at 654293328>
> Traceback (most recent call last):
>   File
> "/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py",
> line 533, in _cleanup
> rm.tpc_abort(self)
>   File
> "/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py",
> line 628, in tpc_abort
> self.manager.tpc_abort(txn)
>   File
> "/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/ZODB/BaseStorage.py",
> line 194, in tpc_abort
> self._abort()
>   File
> "/usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py",
> line 710, in _abort
> self._rollback_load_connection()
>   File
> "/usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py",
> line 166, in _rollback_load_connection
> self._load_conn.rollback()
> OperationalError: (2006, 'MySQL server has gone away')
>
> And after, all instances report this exeption:
>
> 2009-08-12T14:21:53 ERROR Zope.SiteErrorLog
> http://adm.agecom.ba.gov.br/login_form
> Traceback (innermost last):
>   Module ZPublisher.Publish, line 121, in publish
>   Module Zope2.App.startup, line 240, in commit
>   Module transaction._manager, line 96, in commit
>   Module transaction._transaction, line 395, in commit
>   Module transaction._transaction, line 498, in _commitResources
>   Module ZODB.Connection, line 730, in tpc_vote
>   Module relstorage.relstorage, line 675, in tpc_vote
>   Module relstorage.relstorage, line 659, in _vote
>   Module relstorage.relstorage, line 566, in _prepare_tid
>   Module relstorage.adapters.mysql, line 506, in start_commit
>   Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
> StorageError: Unable to acquire commit lock
>
> I solve the problem restarting all instances, and the site became
> operational again, but I have some questions:
>
> This can be a bug or there is any problem in my enviroment/application?
> There is another solution to release commit lock without restart all 
> instances?
>
> Cheers,
>
> --
> Rudá Porto Filgueiras
> http://python-blog.blogspot.com
> ___
> For more information about ZODB, see the ZODB Wiki:
> http://www.zope.org/Wikis/ZODB/
>
> ZODB-Dev mailing list  -  ZODB-Dev@zope.org
> http://mail.zope.org/mailman/listinfo/zodb-dev
>

i just came back from reading relstorage code (research work for
radosstorage) and the lock is actually held on the mysql server.
my guess is that the connection drop you experience earlier left
the lock in place on the mysql-server. obviously the mysql-server
did not notice you connection dying, otherwise it would have
released the lock,see
http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_get-lock
probably restarting mysql would have solved your issue?!

regards, jürgen
--
>> XLhost.de - eXperts in Linux hosting ® <<

XLhost.de GmbH
Jürgen Herrmann, Geschäftsführer
Boelckestrasse 21, 93051 Regensburg, Germany

Geschäftsführer: Volker Geith, Jürgen Herrmann
Registriert unter: HRB9918
Umsatzsteuer-Identifikationsnummer: DE245931218

Fon:  +49 (0)800 XLHOSTDE [0800 95467833]
Fax:  +49 (0)800 95467830

WEB:  http://www.XLhost.de
IRC:  #xlh...@irc.quakenet.org

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


[ZODB-Dev] RelStorage MySQL - StorageError: Unable to acquire commit lock

2009-08-12 Thread Rudá Porto Filgueiras
I begin to use RelStorage in a production site with Plone 2.5.
Everything was running without failures since 01 august 2009.
But today after a failure in tpc_abort, all instances conneceted to
MySQL can't acquire commit lock.

Follow tpc_abort traceback:

2009-08-12T14:12:08 ERROR txn.1115806016 Error in tpc_abort() on
manager  at 654293328>
Traceback (most recent call last):
  File 
"/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py",
line 533, in _cleanup
rm.tpc_abort(self)
  File 
"/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/transaction/_transaction.py",
line 628, in tpc_abort
self.manager.tpc_abort(txn)
  File 
"/usr/local/zope/agecom-virtual/eggs/ZODB3-3.7.3_polling-py2.4-linux-x86_64.egg/ZODB/BaseStorage.py",
line 194, in tpc_abort
self._abort()
  File 
"/usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py",
line 710, in _abort
self._rollback_load_connection()
  File 
"/usr/local/zope/agecom-virtual/eggs/RelStorage-1.2.0b2-py2.4.egg/relstorage/relstorage.py",
line 166, in _rollback_load_connection
self._load_conn.rollback()
OperationalError: (2006, 'MySQL server has gone away')

And after, all instances report this exeption:

2009-08-12T14:21:53 ERROR Zope.SiteErrorLog
http://adm.agecom.ba.gov.br/login_form
Traceback (innermost last):
  Module ZPublisher.Publish, line 121, in publish
  Module Zope2.App.startup, line 240, in commit
  Module transaction._manager, line 96, in commit
  Module transaction._transaction, line 395, in commit
  Module transaction._transaction, line 498, in _commitResources
  Module ZODB.Connection, line 730, in tpc_vote
  Module relstorage.relstorage, line 675, in tpc_vote
  Module relstorage.relstorage, line 659, in _vote
  Module relstorage.relstorage, line 566, in _prepare_tid
  Module relstorage.adapters.mysql, line 506, in start_commit
  Module relstorage.adapters.mysql, line 672, in _hold_commit_lock
StorageError: Unable to acquire commit lock

I solve the problem restarting all instances, and the site became
operational again, but I have some questions:

This can be a bug or there is any problem in my enviroment/application?
There is another solution to release commit lock without restart all instances?

Cheers,

-- 
Rudá Porto Filgueiras
http://python-blog.blogspot.com
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev