Re: [ZODB-Dev] Relstorage and over growing database.

2013-11-14 Thread Martijn Pieters
On Wed, Nov 13, 2013 at 9:24 AM, Jens W. Klein j...@bluedynamics.com wrote:

 Thanks Martijn for the hint, but we are using a history-free database, so
 growth only comes from deleted objects in our case.


Right, that is an important distinction.


 When in history-free mode, is it possible to detect deleted objects at
 store time? That way we could add the zoid at store time to an
 objects_deleted table in order to clean them up later.


No, because multiple references to an object might exist. There is no
reference counting in a ZODB, hence the intensive manual tree traversal job
when garbage collecting.


  Another way to speed up graph traversal would be to store the
 object references in a field of object_state. At the moment we have to read
 the pickle in order to get the referenced zoids. Storing additional -
 redundant - information might not be perfect, but it would allow packing/GC
 of the database without any knowledge of the state objects' structure, i.e.
 using a stored procedure.


That sounds like a feasible idea, at least to me.
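
For context: today the only way to find those references is to load each
state pickle and extract the oids, e.g. via ZODB.serialize.referencesf (a
minimal sketch; the helper name is mine):

from ZODB.serialize import referencesf

def referenced_zoids(state_pickle):
    """Return the oids referenced by one stored object state."""
    refs = []
    referencesf(state_pickle, refs)  # appends each oid found in the pickle
    return refs

Storing that list in a column next to the pickle is exactly the redundancy
you propose; a stored procedure could then walk the graph without ever
unpickling a state.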


 I would like to know what the RelStorage experts think about these ideas.


You may want to contact Shane directly; he *may* not be reading this list
actively.

-- 
Martijn Pieters


Re: [ZODB-Dev] Relstorage and over growing database.

2013-11-12 Thread Martijn Pieters
On Mon, Nov 11, 2013 at 9:24 PM, Daniel Widerin dan...@widerin.net wrote:

 Anyone experienced similar problems packing large RelStorage databases?
 The graph traversal takes a really long time. Maybe we can improve that
 by storing additional information in the relational database?


You should (at least initially) pack *without* GC (change the pack-gc
option from true to false); I packed a humongous RelStorage-backed database
before, and packed to earlier dates in the past first to minimize the
amount of data removed in a single transaction.

Only when we were down to a reasonably sized database did we enable
garbage collection.
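
A rough sketch of that incremental approach with RelStorage 1.5 on
PostgreSQL (the DSN and the list of dates are made up for illustration; the
MySQL and Oracle adapters work the same way):

import time
from ZODB.serialize import referencesf
from relstorage.options import Options
from relstorage.storage import RelStorage
from relstorage.adapters.postgresql import PostgreSQLAdapter

options = Options(pack_gc=False)  # pack without garbage collection first
adapter = PostgreSQLAdapter(dsn="dbname='zodb'", options=options)
storage = RelStorage(adapter, options=options)

# Pack to progressively more recent dates, so each run removes a
# modest amount of data in a single transaction.
for days_ago in (365, 180, 90, 30):
    storage.pack(time.time() - days_ago * 86400, referencesf)

storage.close()

Only once that has brought the size down would you set pack-gc back to
true and pack once more.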

-- 
Martijn Pieters


Re: [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB

2012-08-28 Thread Martijn Pieters
On Mon, Aug 27, 2012 at 2:37 PM, Vincent Pelletier vinc...@nexedi.com wrote:
 NEO aims at being a replacement for use-cases where ZEO is used, but
 with better scalability (by allowing data of a single database to be
 distributed over several machines, and by removing database-level
 locking), with failure resilience (by mirroring database content among
 machines). Under the hood, it relies on simple features of SQL
 databases (safe on-disk data structure, efficient memory usage,
 efficient indexes).

How does NEO compare to RelStorage? NEO appears to implement the
storage in roughly the same way: storing pickles in tables in a SQL
database.

Some differences that I can see from reading your email:

* NEO takes care of replication itself; RelStorage pushes that
responsibility to the database used.
* NEO supports MySQL and sqlite; RelStorage supports MySQL, PostgreSQL and Oracle.
* RelStorage can act as a BlobStorage, NEO can not.

Anything else different? Did you make any performance comparisons
between RelStorage and NEO?

-- 
Martijn Pieters


Re: [ZODB-Dev] Solved!!! Re: unable to import ZODB: class ConflictError, AttributeError

2012-06-14 Thread Martijn Pieters
On Thu, Jun 14, 2012 at 11:38 AM, Ralf Hauenschild
ralf_hauensch...@gmx.de wrote:
 Thank you very much for your help! It finally worked!
 I just deleted the package folders of transaction in the dist-packages
 folder of Python. Then I reinstalled the latest version of transaction and
 it was done :)
 I can now successfully import ZODB.

You probably broke another software package on your system.

The dist-packages directory contains packages managed by the system
package manager, at least on Debian systems (and Debian derivatives such
as Ubuntu). You may want to repair that!

Instead, use a virtualenv *without* the --no-site-packages switch;
this makes sure your environment does not try to import
globally-installed packages.

-- 
Martijn Pieters


Re: [ZODB-Dev] Solved!!! Re: unable to import ZODB: class ConflictError, AttributeError

2012-06-14 Thread Martijn Pieters
On Thu, Jun 14, 2012 at 5:14 PM, Paul Winkler sli...@gmail.com wrote:
 That's backwards, I'm sure you just made a typo but I don't want Ralf
 to get confused :)

Yeah, my mistake. *with*, not *without*. Sorry!

-- 
Martijn Pieters


Re: [ZODB-Dev] Solved!!! Re: unable to import ZODB: class ConflictError, AttributeError

2012-06-14 Thread Martijn Pieters
On Thu, Jun 14, 2012 at 5:34 PM, Ralf Hauenschild
ralf_hauensch...@gmx.de wrote:
 Unfortunately, my aim in using ZODB was to store dictionaries of ~3GB
 in-memory size to the hard drive, after the data has been read from there.
 Of course, this only makes sense if retrieving the dictionaries from ZODB
 back into RAM is faster than parsing the original files via readline()
 again.
 This could not be accomplished:
 The transaction.commit() alone took over an hour for a dictionary which was
 initially parsed from a file in 5 minutes!
 So I guess using ZODB for large files is not recommended. But in the case
 of small files, my RAM is big enough anyway, so unfortunately ZODB has no
 use for me at the moment - or does it?

Do not store large items as persistent objects, no. Your options:

* Store as a ZODB blob, if this is one chunk of data.

* Store as a tree of persistent objects if parts need changing over
time. Future commits would be a lot smaller that way. I'd parse the
original large dictionary in a separate process, and perhaps chunk the
commits (build up the structure over several commits); see the sketch
below.
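
A rough sketch of the second option, assuming the source data is a simple
key/value dump (the file name, line format and batch size are made up):

import transaction
from ZODB import DB
from ZODB.FileStorage import FileStorage
from BTrees.OOBTree import OOBTree

db = DB(FileStorage('Data.fs'))
conn = db.open()
root = conn.root()
if 'big_index' not in root:
    root['big_index'] = OOBTree()
tree = root['big_index']

with open('big_input.txt') as lines:
    for count, line in enumerate(lines, 1):
        key, value = line.rstrip('\n').split('\t', 1)
        tree[key] = value
        if count % 100000 == 0:
            transaction.commit()  # keep each transaction small

transaction.commit()
db.close()

An OOBTree spreads its contents over many small persistent buckets, so a
later change to one key rewrites only the affected bucket instead of one
3GB pickle.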

-- 
Martijn Pieters


Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used

2012-04-19 Thread Martijn Pieters
On Thu, Apr 19, 2012 at 17:20, Claudiu Saftoiu csaft...@gmail.com wrote:
 My question is: what could possibly be causing the server to 'lock up', even
 on a
 simple view like 'is_alive', without using any memory or CPU? Is there some
 ZODB
 resource that might be getting gradually exhausted because I'm not handling
 it properly?

I don't know; anything could lock up your site if something is waiting
on a lock, for example.

Use http://pypi.python.org/pypi/z3c.deadlockdebugger to figure out
what the threads are doing at this time. Preferably, trigger the
dump_threads() method of that module on SIGUSR1, like the Zope
signalstack product does (see
http://svn.plone.org/svn/collective/Products.signalstack/trunk/Products/signalstack/__init__.py
for the exact code to bind the signal handler). That'll tell you
exactly what each thread is busy with when you send the signal.
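
A minimal sketch of that wiring, assuming dump_threads lives at
z3c.deadlockdebugger.threads (which is where the signalstack code imports
it from):

import signal
import logging
from z3c.deadlockdebugger.threads import dump_threads

logger = logging.getLogger('threaddump')

def show_threads(signum, frame):
    # Log a current stack trace for every running thread.
    logger.info(dump_threads())

signal.signal(signal.SIGUSR1, show_threads)

Then send SIGUSR1 to the stuck process (kill -USR1 pid) and read the
thread dump from your event log.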

-- 
Martijn Pieters


Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used

2012-04-19 Thread Martijn Pieters
On Thu, Apr 19, 2012 at 18:22, Claudiu Saftoiu csaft...@gmail.com wrote:
 Are there locks that could possibly be used for the 'is_alive' function?
 Here is the
 definition in its entirety.

 In 'configure.zcml':
   <view
     view=".views.is_alive"
     name="is_alive"
     renderer="json"
     />
 in 'views.py':
     def is_alive(request):
         return True

 Whatever the problem is, it causes 'is_alive' to take forever, and the CPU
 is not
 spinning at 100%, and memory usage is not high. Could this be a lock
 problem?

I have no idea; all you've told us is that you use the ZODB, not what
server framework you use to register your views. Is this Grok, Bluebream,
Repoze.BFG, Zope 2 or something else?

I don't think that is_alive would be the cause of this; it looks like
a simple enough view. But if your server uses a thread pool, and all
the other threads are now occupied by something that got locked up,
then it could be that the server is not answering your is_alive
request at all because it is waiting for a thread to free up first.

-- 
Martijn Pieters


Re: [ZODB-Dev] Migrating from Plone 4.1.4 ZEO data.fs and blobs to Relstorage

2012-04-16 Thread Martijn Pieters
On Fri, Apr 13, 2012 at 02:25, Sam Wilson s...@dotsec.com wrote:
 I have configured zodbconvert to import the data.fs; however, I cannot
 for the life of me find documentation on how to migrate the blobs (bushy
 layout) into RelStorage.

Note that I cannot recommend putting ZODB blobs into any RelStorage
database apart from Oracle. PostgreSQL doesn't handle large blobs very
well (storing them internally as a series of chunks in a dedicated
table), while MySQL doesn't really support blobs at all.

We have one customer on RelStorage on an Oracle cluster (RelStorage was
commissioned for them, in fact) where all ZODB blobs also live in the
same database. Performance is quite decent, but the customer doesn't
try to store more than a few MBs per file.

Better to keep them in an NFS setup for most cases.

-- 
Martijn Pieters


Re: [ZODB-Dev] zeopack error

2012-01-31 Thread Martijn Pieters
On Tue, Jan 31, 2012 at 12:23, Kaweh Kazemi ka...@me.com wrote:
 [ERROR] 2012-01-31T09:20:25: (22455) Error raised in delayed method
 None

Ugh, a quick look at the code reveals that this must've been raised by
a SlowMethodThread handler, and it indeed doesn't provide the
exception info in the log because that's already been cleared.

However, the method that does the logging does get passed the
exception info (it is, after all, partially passed back to the packing
script). Could you open ZEO/zrpc/connection.py and edit line 54? The
context looks like this:

def error(self, exc_info):
    self.sent = 'error'
    log("Error raised in delayed method", logging.ERROR, exc_info=True)
    self.conn.return_error(self.msgid, *exc_info[:2])

If you do not know where that file is located, search for the ZODB3
egg path in your bin/zeo file; that's the directory where you'll find
the ZEO code. Change exc_info=True to exc_info=exc_info, so the
error method looks like this:

def error(self, exc_info):
    self.sent = 'error'
    log("Error raised in delayed method", logging.ERROR, exc_info=exc_info)
    self.conn.return_error(self.msgid, *exc_info[:2])

Restart your ZEO server and attempt to pack again. Now it should log
the exception instead of None.

-- 
Martijn Pieters


Re: [ZODB-Dev] zeopack error

2012-01-31 Thread Martijn Pieters
On Tue, Jan 31, 2012 at 14:20, Kaweh Kazemi ka...@me.com wrote:
 I did the change and here we go:

Right, that's a ZEO bug report right there then; my change should go
into ZEO trunk. Jim, did you catch it?

 I have one assumption - seeing this traceback - but I can't prove if it's
 correct: one old object (which is a candidate to be removed) in the users
 storage is referencing an object in a different storage (we are using
 multiple databases) which has been packed away previously. Generally that
 shouldn't happen, but maybe we have wrongly deleted objects from the second
 storage and packed it previously.

I think the assumption is incorrect; the method referencesf has the
following docstring:

"""Return a list of object ids found in a pickle

A list may be passed in, in which case, information is
appended to it.

Only ordinary internal references are included.
Weak and multi-database references are not included.
"""


Note that weak and multi-database refs are ignored here.

An unhashable type error generally means you are trying to use a list
(a mutable, and thus unhashable, type) as a key in a dictionary. So in
this case it is trying to load a pickle record that shouldn't be
possible to produce in Python: one where a list is used as a key in a
dict.

I am a little at a loss on how to continue from here. You could rig
the .noload() calls in referencesf (in ZODB/serialize.py) with a try:
except TypeError handler:

u.persistent_load = refs
try:
    u.noload()
    u.noload()
except TypeError:
    pass

That'll result in a shorter references list, and thus a risk that
too many records will be garbage collected. If you are lucky, the type
error is at the end of the pickle, so not many references will be
missed, and/or the missed references point to objects that are about
to be packed away anyway.

Alternatively, you could dump the pickle in question (variable p) to a
file instead of passing:

except TypeError:
    open('/tmp/b0rkenpickle', 'wb').write(p)
    raise

Then upload that pickle somewhere for people on this list to analyze.
I cannot promise anyone will, of course, but if someone does and the
pickle is shown to contain only primitive data types (no references at
all), then the pass above is certainly going to solve your problem.

-- 
Martijn Pieters


Re: [ZODB-Dev] zeopack error

2012-01-30 Thread Martijn Pieters
On Mon, Jan 30, 2012 at 13:41, Kaweh Kazemi ka...@me.com wrote:
 Unfortunately I'm not seeing anything useful, which is my problem:

That's because that's not the ZEO server log output, but the output
from zeopack. Your ZEO server keeps logs too; Jim is asking for the
information you'll find there.

-- 
Martijn Pieters


[ZODB-Dev] Debugging RelStorage hang-ups

2011-10-20 Thread Martijn Pieters
On a test server with a Plone 4.1 upgrade of a client setup, we are
experiencing regular lock-ups of the 2 instances we run. After a
restart, within a few hours at least one of the instances will be
waiting on Oracle to roll back:

  File "/srv/test-plone4/eggs/RelStorage-1.5.0-py2.6.egg/relstorage/storage.py", line 1228, in poll_invalidations
    changes, new_polled_tid = self._restart_load_and_poll()
  File "/srv/test-plone4/eggs/RelStorage-1.5.0-py2.6.egg/relstorage/storage.py", line 1202, in _restart_load_and_poll
    self._adapter.poller.poll_invalidations, prev, ignore_tid)
  File "/srv/test-plone4/eggs/RelStorage-1.5.0-py2.6.egg/relstorage/storage.py", line 254, in _restart_load_and_call
    self._load_conn, self._load_cursor)
  File "/srv/test-plone4/eggs/RelStorage-1.5.0-py2.6.egg/relstorage/adapters/oracle.py", line 322, in restart_load
    conn.rollback()

I am a bit at a loss as to where to start debugging this. Any hints from anyone?

-- 
Martijn Pieters


Re: [ZODB-Dev] Debugging RelStorage hang-ups

2011-10-20 Thread Martijn Pieters
On Thu, Oct 20, 2011 at 15:41, Shane Hathaway sh...@hathawaymix.org wrote:
 - Is Oracle running out of space somewhere, such as the undo/redo logs?

I'll ask, could be.

 - Do rollbacks in Oracle acquire some kind of lock?

No idea.

 - Could RAC be the culprit?  (Synchronous replication always has weird
 edge cases.)

The test rig is not using RAC. In fact, the production environment no
longer uses RAC either.

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage breaks History tab

2011-09-02 Thread Martijn Pieters
On Thu, Sep 1, 2011 at 22:56, Chris Withers ch...@simplistix.co.uk wrote:
 I see the resulting transactions in both the root Undo tab and the Undo
 tab of the page template, but not in the History tab of the page template.

Without looking, I'd say the History tab relies on the internal
structure of the stock FileStorage implementation, not on public ZODB
APIs. I haven't touched the Archetypes history code in at least 5 years
though, so I am not sure about this.

-- 
Martijn Pieters


Re: [ZODB-Dev] Bug during pack with Relstorage 1.5.0

2011-08-31 Thread Martijn Pieters
2011/8/30 Sylvain Viollon sylv...@infrae.com:
   We already tried that (there was a similar problem in the mail archives, 
 pre 1.5.0).

   But the error stays the same.

   Should we disable gc ?

I guess; disabling GC would avoid touching those tables.

Let's see if Shane can chime in on your particular problem though.

-- 
Martijn Pieters


Re: [ZODB-Dev] Relstorage Blob support and Oracle

2011-06-15 Thread Martijn Pieters
On Tue, Jun 14, 2011 at 22:22, Martijn Pieters m...@zopatista.com wrote:
 I am considering altering the ON COMMIT behaviour of the
 temp_blob_chunk table to TRUNCATE to extend this behaviour to all
 remaining blob rows on commit. However, such events should be rare
 enough, and there is a `vacuumlo` command that can find orphaned blob
 objects in the pg_largeobjects table and clean these out.

I've now made this change. This means that the temp_blob_chunk table
is dropped when a storage is opened, not on commit, and instead its
contents are truncated on commit and any oids left in there that
didn't make it to blob_chunk are unlinked.

 I'm also writing a unittest that'll exercise the blob storage a little
 more, by creating a file larger than the maximum storable in one
 chunk, then upload, commit and re-download to see if the data survived
 intact. This'll be set to test level 2 to not clog up the normal test
 cycle as generating a 4GB+ random file can take a little while! :-)

See 
http://zope3.pov.lt/trac/changeset/121949/relstorage/branches/postgres_blob_oid

The tests run at test level 2, meaning that they do not run by
default. You need to either set the test level to 2 (-a 2) or use the
--all switch to run all levels regardless. The PostgreSQL version
(2GB+ blob size) takes a healthy quarter of an hour on my laptop to
complete one such test.

 Last but not least I'll need to write a migration script for those
 users of RelStorage 1.5.0b2 already in production.

I have figured out how to do this (see
http://archives.postgresql.org/pgsql-general/2009-01/msg00771.php). I
won't try to consolidate the blob_chunks, so you'll end up with 1MB
oid blobs but that's fine as the download code can handle those
perfectly.

The query just needs to be written and tested.

-- 
Martijn Pieters


Re: [ZODB-Dev] Relstorage Blob support and Oracle

2011-06-15 Thread Martijn Pieters
On Wed, Jun 15, 2011 at 16:23, Martijn Pieters m...@zopatista.com wrote:
 Last but not least I'll need to write a migration script for those
 users of RelStorage 1.5.0b2 already in production.

 I have figured out how to do this (see
 http://archives.postgresql.org/pgsql-general/2009-01/msg00771.php). I
 won't try to consolidate the blob_chunks, so you'll end up with 1MB
 oid blobs but that's fine as the download code can handle those
 perfectly.

 The query just needs to be written and tested.

All done, branch merged into trunk.

-- 
Martijn Pieters


Re: [ZODB-Dev] Relstorage Blob support and Oracle

2011-06-15 Thread Martijn Pieters
On Wed, Jun 15, 2011 at 21:40, Shane Hathaway sh...@hathawaymix.org wrote:
 Cool!  I'll apply your migration script to my buildbots and report back.
  Right now (before migrating) all the bots fail. ;-)

Whoops! :-) I tested the migrations on some test data, so they should
work on your databases without issue.

-- 
Martijn Pieters


Re: [ZODB-Dev] Relstorage Blob support and Oracle

2011-06-10 Thread Martijn Pieters
On Fri, Jun 10, 2011 at 07:17, Shane Hathaway sh...@hathawaymix.org wrote:
 I see your thinking now.  RelStorage will download multiple chunks from
 Oracle but will now only upload one chunk per blob to Oracle.  If others
 want to do the same for PostgreSQL or MySQL, you've set the example of
 how to do it without disruption.

 Thank you for not changing the schema.  I get a lot of complaints these
 days anytime the schema changes.  I ask just one favor: please run some
 manual tests with multi-gigabyte blobs that don't fit in RAM.

I've committed my refactor now.

In testing I ran into some limitations of cx_Oracle; it uses the older
(pre-10.1) OCI APIs that only allow offsets and sizes of up to 4GB
(i.e. 32-bit unsigned integers). Moreover, its naive use of
signed integers instead of unsigned ones means that on 32-bit
platforms the largest *LOB write and read offset it can handle is 2GB
- 1 byte (sys.maxint). I've seen that the author is already aware that
there is a newer API that upgrades the offset and size types to 64-bit
unsigned integers, and I've asked him what the status is of cx_Oracle
supporting that.

In the meantime, to answer your request for a favour: Uploading and
downloading blobs within these limits worked great and memory usage
wasn't impacted at all.

For us these limits are absolutely not a problem. Although our Oracle
setup is performing very nicely when it comes to BLOB reading and
writing (multiple MBs per second), pulling a BLOB of several GB to a
file in your local non-shared blob cache before serving is going to
take too long anyway, and you really want to use a shared blob storage
for that instead.

I've only tested the Oracle blob upload and download methods in the
mover, since that's what changed in this commit. I didn't run the full
test suite as I lacked the required SYS access on the oracle cluster
and didn't have time to set up a local Oracle database. If you could
run the test suite against your Oracle setup that'd be great. :-)

Next I'll look into adding PostgreSQL support too; it has a nicer API
still, in that it lets you specify files directly to upload from or
download to.
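
In psycopg2 terms that's the large object API, which can import from and
export to files directly (a hedged sketch; the connection details and
paths are made up):

import psycopg2

conn = psycopg2.connect("dbname=zodb")

# Upload: create a large object straight from a file (lo_import).
lob = conn.lobject(0, 'wb', 0, '/path/to/blob-source')
oid = lob.oid
lob.close()

# Download: stream an existing large object straight to a file (lo_export).
conn.lobject(oid, 'rb').export('/path/to/cache-copy')
conn.commit()

No chunking loop is needed on the Python side at all.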

-- 
Martijn Pieters


[ZODB-Dev] Relstorage Blob support and Oracle

2011-06-09 Thread Martijn Pieters
Hi,

We've looked over the RelStorage ZODB Blob storage implementation and
came to the conclusion that the current use of blob chunks is
unnecessary in Oracle when using the cx_Oracle database connector. Not
splitting ZODB Blobs into chunks may have performance benefits on the
Oracle side (especially on 11g) as Oracle can then use read-ahead more
efficiently for the larger BLOBs while streaming these back to the
client.

I'll be refactoring the blob support currently found in RelStorage
1.5b2 to just store one blob in one row, using the cx_Oracle
LOB.read() and .write() methods, which let you read and write to blobs
in chunks to avoid memory overload, and I'll reuse the blob chunk size
to determine how much we read / write per iteration.
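
The download side would look roughly like this (a sketch only; the chunk
size constant and helper name are mine):

CHUNK_SIZE = 1 << 20  # reuse the existing blob chunk size per iteration

def download_lob(lob, outfile, chunk_size=CHUNK_SIZE):
    # cx_Oracle LOB offsets are 1-based.
    size = lob.size()
    offset = 1
    while offset <= size:
        data = lob.read(offset, chunk_size)
        outfile.write(data)
        offset += len(data)

Uploads mirror this with LOB.write(data, offset), so a multi-GB blob never
has to fit into memory in one piece.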

I am currently leaning towards dropping the chunk column in the Oracle
schema altogether; it certainly won't hold any value other than
integer 0 after my refactor. Any reason to keep it, other than that
others who are already using 1.5b2 on an Oracle database would then
have to drop that column again (or set it to a default value of 0 on
insert)? Should the code still support reading blob chunks?

-- 
Martijn Pieters


Re: [ZODB-Dev] Relstorage Blob support and Oracle

2011-06-09 Thread Martijn Pieters
On Thu, Jun 9, 2011 at 22:03, Martijn Pieters m...@zopatista.com wrote:
 I'm retaining the schema; there is a chance people have updated to
 1.5b2 already and are using blobs in production. My refactor maintains
 compatibility with the chunked blob storage.

I've attached my patch for review, if you like. I'll be testing it
tomorrow against Oracle before committing.

-- 
Martijn Pieters


oracle_blobs.diff
Description: Binary data


Re: [ZODB-Dev] Pack complete: 80% of a 425GB database packed

2011-03-01 Thread Martijn Pieters
On Mon, Feb 28, 2011 at 22:12, Shane Hathaway sh...@hathawaymix.org wrote:
 And unless I am missing something, we don't
 need to worry about the transactions table either. The rest of
 RelStorage ignores transactions where the packed flag has been set to
 TRUE, so deleting the packed and empty transactions from the table
 will never lead to deadlocks, right?

 It might lead to deadlocks, actually.  Deadlocks occur when concurrent
 clients try to acquire locks in different orders.  SQL databases handle
 deadlocks of that type by returning an error to all but one of the
 deadlocking clients, similar to the way ZODB deals with conflict errors.

 Also, in SQL, even reading a table usually acquires a lock, so it's
 really hard to prevent deadlocks without something like a commit lock.

 Or am I missing something about
 how RelStorage implements polling and caching? In any case, the
 history-free version doesn't need to lock at all, because it only ever
 touches the pack-specific tables in the pack cleanup.

 True.

 If the
 transaction table needs no lock protection either, we can get rid of
 the lock during cleanup altogether. I'd like to hear confirmation on
 this though.

 I wouldn't do that.  Too risky.  (Many database architects say that
 deadlocks are normal and should be handled in application code, but in
 my experience, deadlocks kill applications under load and I prefer to
 put in the effort to avoid them entirely if possible.)

Okidoki, I'll happily defer to you on that. With our database now much
much smaller I fear that commit lock during pack clean-up less and
less.

  On a side note: I see that the history-preserving object_ref and
 object_refs_added deletion SQL statements have been optimized for
 MySQL, but not for the history-free version. Wouldn't those statements
 also benefit from using a JOIN?

 Perhaps, but I prefer not to optimize until I see what happens in
 practice.  Mis-optimized MySQL statements frequently turn out to have
 something like O(n^3) performance.

I don't really care about that edge case; I just noticed that the
MySQL statements were optimized in one version but not the other. I'll
leave tinkering with that to people who actually run RelStorage on
MySQL in the first place; I'll focus on the Oracle side.

I've added a switch to disable updating the schema during RelStorage
start-up, btw. The auto-update of the PL/SQL package bit me at some
point, disabling our cluster. Luckily a swift restart of one of the
clients reinstated the 1.4 version quickly. :-) We are not quite ready
for a full 1.5.0 upgrade just yet.. ;-)

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-28 Thread Martijn Pieters
On Fri, Feb 25, 2011 at 20:16, Shane Hathaway sh...@hathawaymix.org wrote:
 My Buildbot is reporting success with all 3 databases on several
 platforms, so it's looking good.  It's having trouble on Windows, but I
 suspect that's a build bug, not a software bug.  RelStorage supports
 MySQL 5.1, but not 5.5 yet.

The nowait locking strategy worked beautifully on our mammoth pack, so
I've committed it to RelStorage trunk. I didn't get a chance to test
it against MySQL 5.1 though, but the MySQL changes were minimal and
mirrored those for the Oracle version.

-- 
Martijn Pieters


[ZODB-Dev] Pack complete: 80% of a 425GB database packed

2011-02-28 Thread Martijn Pieters
Early this morning, after packing through the weekend, the pack of our
somewhat overweight Oracle RelStorage ZODB completed. I am still
waiting for the final size from the customer DBAs, but before the pack
this beast was occupying 425GB. The pack removed 80% of the
object_state rows, so hopefully the database is now reduced to a more
manageable 85GB or so.

The nowait locking strategy worked very well. We did have a few
transactions that covered a huge number of object states, and these
managed to lock the database up for 10-15 minutes at a time, forcing
me to abort the pack at some point on Saturday to prevent the Zope
cluster from becoming unresponsive. These transactions would have locked
up the database like this whether we used the nowait strategy or the
duty cycle; I am not sure what we could have done about this problem
short of not letting the ZODB and transaction sizes get out of hand in
the first place.

I've created a spreadsheet with the data from the pack log info, to
visualize for ourselves what the database looked like in terms of
object states per transaction and such. Note that the timestamps up
until row 368 reflect already-packed transactions (that's how far we
got before aborting on Saturday); after that, the times reflect how
long it took to remove object states for around 4000 transactions at a
time. See:

  
https://spreadsheets.google.com/ccc?key=0Aqf3DYYXSZ6RdEdkUF9lTFBKSEtkNFVqNEo2b3lnTnchl=en_GBauthkey=COip9qsL

The final pack cleanup took 2.5 hours (4:40: cleaning up, 07:08:
finished successfully). I've been looking a bit closer at that stage and
am wondering if it really needs to hold the commit lock. Holding
the commit lock for such a long time (at least for the bulk of those
2.5 hours) is a Really Bad Idea, and I was lucky the pack completed
within the maintenance window. One hour later and the cluster would
have been affected hugely.

I think we can remove holding the commit lock during the pack cleanup
altogether. For starters, the object_ref and object_refs_added tables
are only ever used during packing, so the packing lock is more than
enough to protect these. And unless I am missing something, we don't
need to worry about the transactions table either. The rest of
RelStorage ignores transactions where the packed flag has been set to
TRUE, so deleting the packed and empty transactions from the table
will never lead to deadlocks, right? Or am I missing something about
how RelStorage implements polling and caching? In any case, the
history-free version doesn't need to lock at all, because it only ever
touches the pack-specific tables in the pack cleanup.

I'll move the commit lock down to the section where empty transactions
are deleted, and remove it from the history-free version. If the
transaction table needs no lock protection either, we can get rid of
the lock during cleanup altogether. I'd like to hear confirmation on
this though.

On a side note: I see that the history-preserving object_ref and
object_refs_added deletion SQL statements have been optimized for
MySQL, but not for the history-free version. Wouldn't those statements
also benefit from using a JOIN?

-- 
Martijn Pieters


Re: [ZODB-Dev] blobs missing with relstorage and small blob cache dir

2011-02-28 Thread Martijn Pieters
On Mon, Feb 28, 2011 at 16:22, Hanno Schlichting ha...@hannosch.eu wrote:
 Blobs are considered experimental in ZODB 3.8. Especially the entire
 blob cache changed completely for ZODB 3.9.

 I think you might want to upgrade to Plone 4 and ZODB 3.9 to get a
 stable environment. Or you'll likely run into more problems.

Actually, RelStorage has its own cache cleanup code for the
database-stored blobs scenario. And by the looks of it, it does
trigger a blob cache cleanup (in a separate thread) during blob load.

Maurits, see blobhelper.py: BlobHelper.download_blob calls
BlobCacheChecker.loaded with check=True (the default). Try calling it
with check=False instead.
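
If you want to test that without editing the egg, a monkey-patch along
these lines should do (a debugging sketch only; I'm deliberately not
guessing at the method's full signature):

from relstorage import blobhelper

_orig_loaded = blobhelper.BlobCacheChecker.loaded

def loaded_no_check(self, *args, **kw):
    kw['check'] = False  # never trigger the cache cleanup on blob load
    return _orig_loaded(self, *args, **kw)

blobhelper.BlobCacheChecker.loaded = loaded_no_check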

-- 
Martijn Pieters


Re: [ZODB-Dev] blobs missing with relstorage and small blob cache dir

2011-02-28 Thread Martijn Pieters
On Mon, Feb 28, 2011 at 17:09, Maurits van Rees
m.van.r...@zestsoftware.nl wrote:
 That works: the blobs stay.  But then it looks like the blob cache size
 is only checked once when starting up the zope instance.  For example
 with a blob cache size of 100,000 bytes and two images of about 65K and
 one of about 1MB, I don't see any blobs getting removed from the cache
 while visiting those images or pages that show them.

There is a cache cleanup run during transaction vote as well, but it
looks like that is only applied when there are blob changes to commit.

But we've now at least established that the cache cleanup in
RelStorage is to blame here, so we can focus on
_check_blob_cache_size, specifically on line 490 and further.

-- 
Martijn Pieters


[ZODB-Dev] Large ZODB packing and progress

2011-02-25 Thread Martijn Pieters
Last night we used our two-phase pack to start packing the largest
Oracle RelStorage ZODB we run. Some statistics first:

* Never packed before in its 2-year existence.
* Has more than 4.5 million transactions, 52 million object states.
* Packing to 60 days ago means we'll lose 4 million transactions and
41.5 million states.

Packing ran through the night, but I had to abort the pack this
morning as I am still skittish about holding commit locks for packing
during normal operations. I'm confident we can resume the pack though,
as we can skip the analysis phase and just have the pack run re-do
the already-packed transactions quickly (all DELETE SQL statements
will operate on already-deleted rows, so they should be fast).

In the 8 hours that we actually packed, almost 300k transactions were
processed. That's only 7 percent of the whole, so I may want to tune
the duty cycle to a more aggressive setting. Some 8 million object
states were deleted though, because the earliest transactions in this
database consisted of massive imports of data from the system we
replaced.

We encountered only 2 problems: The Oracle cluster is set up to create
redo logs for transactions to be backed up from a staging area. We
generated 35 GB (estimated) of such logs, so you need to make sure
there is enough space allocated for these files. My initial analysis
runs from a day earlier managed to overrun this space and Oracle
ground to a screeching halt. The limits were lifted and we kept a
close eye on the Oracle v$recovery_file_dest view after that.

The other problem is that the packing process offers no progress
information between the "will pack" and "cleaning up" lines. You could
turn up the log level to 'debug', at which point you'll get lines for
*every* transaction to be packed, and at 4 million such transactions
that's a bit too much for me. I've added INFO-level logging that
provides progress feedback instead, at most once every 0.1% of the
total number of transactions. See:

  https://bitbucket.org/mjpieters/relstorage-mq/src/tip/packingprogress.patch

I think I'll commit that progress patch to RelStorage SVN after we
tried it on for size, in the next few days.

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-25 Thread Martijn Pieters
On Fri, Feb 25, 2011 at 12:48, Shane Hathaway sh...@hathawaymix.org wrote:
 Do your best and I'll take a look after you've committed the patches to
 the trunk. :-)

The pack progress and two-phase pack patches have been committed.

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-25 Thread Martijn Pieters
On Thu, Feb 24, 2011 at 16:56, Martijn Pieters m...@zopatista.com wrote:
 I see a lot of transaction aborted errors on the ZODB multi-thread
 tests with this patch in place, so I'll have to investigate more.
 Thread debugging joy!

In the end it was a simple mistake in the PostgreSQL version of the
commit lock method signature; a True instead of a False made the NOWAIT
variant the default. Oops!

The tests now pass against PostgreSQL. I haven't been able to get the
tests working against a MySQL 5.5 server on my laptop though (OS X
10.6), but the tests fail without the patch too. It looks like the
concurrent tests are not closing database connections or something. If
someone with a working MySQL test setup can report if the patch works,
that'd be much appreciated. See:

  
https://bitbucket.org/mjpieters/relstorage-mq/src/tip/nowait_commitlock_pack.patch

In the meantime, I'll bite the bullet and try this out on the production server!

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-24 Thread Martijn Pieters
On Wed, Feb 23, 2011 at 15:08, Martijn Pieters m...@zopatista.com wrote:
 I've started an optimistic locking strategy patch in my patch queue;
 it contains only this locking strategy change for now:

  https://bitbucket.org/mjpieters/relstorage-mq/src/tip/optimistic_commitlock_pack.patch

I've made progress on the patch this afternoon. Next up are tests for
both patches. The above patch now uses the nowait locking strategy to
run pack batches. It has been renamed though, and now lives at:

  
https://bitbucket.org/mjpieters/relstorage-mq/src/tip/nowait_commitlock_pack.patch

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-24 Thread Martijn Pieters
On Wed, Feb 23, 2011 at 14:41, Martijn Pieters m...@zopatista.com wrote:
 I've moved this patch to bitbucket at
 https://bitbucket.org/mjpieters/relstorage-mq/src/tip/twophasepack.patch
 and updated the README a little more to document the options to
 zodbpack.

The two-phase pack patch has been updated again:

  https://bitbucket.org/mjpieters/relstorage-mq/src/tip/twophasepack.patch

I've now updated the tests, and ironed out a flaw the tests revealed.
I now consider this patch mergeable into RelStorage.

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-24 Thread Martijn Pieters
On Thu, Feb 24, 2011 at 14:26, Martijn Pieters m...@zopatista.com wrote:
 I've made progress on the patch this afternoon. Next up are tests for
 both patches. The above patch now uses the nowait locking strategy to
 run pack batches. It has been renamed though, and now lives at:

  https://bitbucket.org/mjpieters/relstorage-mq/src/tip/nowait_commitlock_pack.patch

I see a lot of transaction aborted errors on the ZODB multi-thread
tests with this patch in place, so I'll have to investigate more.
Thread debugging joy!

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-23 Thread Martijn Pieters
On Tue, Feb 22, 2011 at 22:51, Maurits van Rees
m.van.r...@zestsoftware.nl wrote:
 I wonder if it may help to set pack-gc to false during the first pack.
 According to the docs this is faster, though it of course leaves more
 unused objects behind.  Set pack-gc to the default true value for
 subsequent packs.  Theoretically this should make sure that the first
 pack will finish on time and leave an already smaller database; after
 the second pack the database is at its smallest.

That would be a good idea indeed.

Without GC enabled, it won't analyse object references. As a result,
the number of queries run during a non-GC pre-pack is far lower too,
so it's a lot easier on the database. Pre-pack becomes 5 queries,
essentially (3 inserts, 2 truncates), as opposed to 17 + batched
reference updates + batched pack state updates.

And with the 'easy' records cleared out with a non-GC run, presumably
queries during a GC run should be easier on the database as there is
less data to scan through.

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-23 Thread Martijn Pieters
On Tue, Feb 22, 2011 at 21:41, Martijn Pieters m...@zopatista.com wrote:
 BTW, should I just commit the patch, or do you want to integrate it
 yourself?

Updated patch attached; added the options changes to component.xml and
README.txt.

-- 
Martijn Pieters


twophasepack.patch
Description: Binary data


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-23 Thread Martijn Pieters
On Wed, Feb 23, 2011 at 11:55, Martijn Pieters m...@zopatista.com wrote:
 Updated patch attached; added the options changes to component.xml and
 README.txt.

I've moved this patch to bitbucket at
https://bitbucket.org/mjpieters/relstorage-mq/src/tip/twophasepack.patch
and updated the README a little more to document the options to
zodbpack.

-- 
Martijn Pieters


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-23 Thread Martijn Pieters
On Tue, Feb 22, 2011 at 21:41, Martijn Pieters m...@zopatista.com wrote:
 I'll look into working the locking idea into a patch too,
 but I'll need help with supporting Postgres and MySQL as I don't know
 their locking semantics.

Both MySQL and Oracle support lock timeouts and already use a timeout
for the commit lock. PostgreSQL has a 'NOWAIT' parameter for locking.

All that is needed then is a nowait=False keyword parameter for the
hold_commit_lock method. For Oracle and MySQL, nowait=True simply means
that the timeout is set to 0; for PostgreSQL, NOWAIT is added to the
LOCK TABLE statement.

In all cases, when nowait is True, I guess the method should return a
boolean flag to indicate a successful lock, not throw an exception.
Exceptions would work too but a boolean would make more sense here.
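
For PostgreSQL that could look roughly like this (a sketch, assuming the
commit lock is the dedicated commit_lock table from the shipped schema):

import psycopg2

def hold_commit_lock(cursor, nowait=False):
    """Return True if the commit lock was acquired."""
    if not nowait:
        cursor.execute("LOCK TABLE commit_lock IN EXCLUSIVE MODE")
        return True
    try:
        cursor.execute("LOCK TABLE commit_lock IN EXCLUSIVE MODE NOWAIT")
    except psycopg2.DatabaseError:
        cursor.connection.rollback()  # lock busy; the caller backs off
        return False
    return True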

I've started an optimistic locking strategy patch in my patch queue;
it contains only this locking strategy change for now:

  
https://bitbucket.org/mjpieters/relstorage-mq/src/tip/optimistic_commitlock_pack.patch

-- 
Martijn Pieters


[ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-22 Thread Martijn Pieters
Hi,

I was already investigating the possibility of splitting the RelStorage
packing process up into smaller chunks.

Due to the expected load on the Oracle cluster during a pack, we'll
have to run the pack at night and want to be absolutely certain that the
database is ready for normal site operations again the next day. With
a 40+GB database (it hasn't been packed for its entire run, more than 2
years now) we are not confident packing will be done in one night.

To at least get a handle on how much work the packing is going to be,
and to have a nice stopping point, I looked at splitting pre-pack and
pack operations out into two separate steps. To my delight I saw that
the 1.5.0 beta already implements basically running only the pre-pack
phase (the --dry-run option). From there I created the attached patch,
one that renames the dry-run op into a 'prepack only' option, and adds
another option to skip the pre-pack and just use whatever is present
in the pack tables.

I haven't yet actually run this code, but the change isn't big. I
didn't find any relevant tests to update. Anyone want to venture some
feedback?

Helge Tesdal and I also looked into the pack operation itself, and how
it uses a duty cycle to give other transactions a chance to commit
during pack. We think there might be a better pattern to handle the
locking.

Currently, with the default values, the pack operation will hold the
commit lock for 5 seconds, pack, then release the lock for 5 more
seconds, repeating until done. With various options you can alter
these timings, but the basic principle is the same. For Oracle, where
the commit lock has a time-out, this means that packing can fail
because the commit lock times out. For all backends, Oracle or
otherwise, commits elsewhere on a site cluster will have to wait long
periods of time before they can proceed, leading to severe delays on a
heavily trafficked website.

With the variable time-out for requesting a commit lock on Oracle,
however, there is a different option. I do not know if MySQL and
Postgres can support this too - I haven't looked into their lock
acquisition options - but the following relies on lock acquisition
timeouts.

Consider the following packing algorithm:

 * Use a short timeout (say 1 second) to request the commit lock.
 * If it doesn't time out:
* run one batch update cycle (up to 100 transactions processed).
* optionally clean out associated blobs
* unlock
* loop back up
 * If it does time out:
* commit lock is busy, so back off by sleeping a bit
* loop back up

By timing out the lock request quickly, you give commits from
non-packing Zope transactions the right of way. Packing truly becomes a
non-intrusive background operation. Is this a viable scenario?
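
In (simplified) Python the loop would be shaped like this; the adapter
method names are placeholders for whatever the real API ends up being:

import time

def pack_loop(adapter, run_batch, more_work):
    while more_work():
        if adapter.hold_commit_lock(nowait=True):  # placeholder API
            try:
                run_batch()  # pack up to ~100 transactions
            finally:
                adapter.release_commit_lock()
        else:
            # Lock busy: a regular commit is in flight, so back off briefly.
            time.sleep(1)

Application commits then never wait on the packer for longer than one
small batch.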

-- 
Martijn Pieters


twophasepack.patch
Description: Binary data


Re: [ZODB-Dev] RelStorage 1.5.0b1 dry-run two phase pack, better pack lock behaviour

2011-02-22 Thread Martijn Pieters
On Tue, Feb 22, 2011 at 19:12, Shane Hathaway sh...@hathawaymix.org wrote:
 Both ideas are excellent.  The new options even open the possibility of
 running the pre-pack on a copy of the database, then copying the pack
 tables to the main database for a final run.

For this project, we have a mirrored test cluster high on our
wishlist, but having the database packed first would make that easier,
given the size of the database now. In the future, however, copying
back the prepared pack tables is an excellent prospect.

-- 
Martijn Pieters