RE: [Zope3-Users] zodb objects backup
> > It would be very nice if there were a faq or other doc addressing all of
> > these scalability-related questions systematically for someone like me
> > who understands what a relational database is doing for them (in return
> > for squashing all their objects). It seems I'm not the only one with
> > these concerns. :)
>
> Welcome to Open Source, we look forward to seeing your newly-written faq
> as it becomes available online ;-)

Well, I've got enough info to write one by now... only, not having reviewed or tested the code, I couldn't actually vouch for it.

But seriously, coming to Zope for an app server, I had no idea that ZODB was as robust as it seems to be. I think, from a marketing standpoint for the Zope corp (and others in the community deriving sustenance from providing zope-related products and services), it would make a lot of sense if they at least made the claim in a prominent place that ZODB was a serious candidate to replace an RDBMS if you didn't absolutely need SQL. Collecting the info together could be put on the "todo" list, but motivating people to seek it out would be a good first step.

> > In the meantime, I had an idea about my current implementation: maybe
> > instead of __getstate__ and __setstate__ I should put the external data
> > in _v_data (marking as "volatile"). Then I could trap for its existence,
> > and load if necessary; and also have an explicit "refresh" wired to a
> > button in the GUI.
>
> Yup, this is exactly what _v_ was designed for...

Great.

- Shaun

___
Zope3-users mailing list
Zope3-users@zope.org
http://mail.zope.org/mailman/listinfo/zope3-users
Re: [Zope3-Users] zodb objects backup
Hi Shaun,

Shaun Cutts wrote:
> Ah -- very nice: so Data.fs *is* a transaction log.

Yup, and a very simple, robust one at that ;-)

> In theory an RDBMS with write ahead logging is still more secure because the transaction log is only backup, and the rest of the database is another copy of the current state (though not with undo capability).

Well, if you worry about this, fork out for ZRS, which writes all your transactions to multiple back end storage servers...

> But with replication, this issue is taken care of. (Too bad replication isn't part of the core functionality)

You can always do application-level replication ;-)

> Section 3.1... so ZODB is effectively doing MVCCS and with per-object locks to resolve conflicts.

Someone else explained how Zope 2.8+ is now even better with this :-)

> (Question: can one explicitly lock an object without changing it? I guess just setting _p_changed?)

No, 'cos that would also mean that a copy of the object got added to the end of Data.fs when the current transaction is committed, which is done by Zope's publisher...

> 3) general lack of query language potentially problematic for datamining

catalogs are your friends...

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
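The "Data.fs is a transaction log" point, and the remark that touching an object appends another copy of it at commit, can be sketched with a toy append-only store. This is only a plain-Python model under stated assumptions: the class `AppendOnlyStore`, its methods, and the "account" object are hypothetical names made up for illustration, and none of this is FileStorage's real API or on-disk format.

```python
class AppendOnlyStore:
    """Toy model of an append-only object store (not FileStorage itself)."""

    def __init__(self):
        self.log = []  # committed transactions, oldest first

    def commit(self, changes):
        """Append one transaction: a dict mapping object id -> new state."""
        self.log.append(dict(changes))
        return len(self.log)  # transaction number, starting at 1

    def load(self, oid, as_of=None):
        """Return the newest state of oid, optionally as of transaction as_of."""
        upto = len(self.log) if as_of is None else as_of
        for txn in reversed(self.log[:upto]):
            if oid in txn:
                return txn[oid]
        raise KeyError(oid)


store = AppendOnlyStore()
t1 = store.commit({"account": {"balance": 10}})
t2 = store.commit({"account": {"balance": 25}})
# Marking an object as changed without altering it still appends a record.
t3 = store.commit({"account": {"balance": 25}})

current = store.load("account")
before_t2 = store.load("account", as_of=t1)
```

Because nothing is overwritten, older states stay readable, which is roughly why undo is possible and why the file keeps growing until it is packed.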
Re: [Zope3-Users] zodb objects backup
Shaun Cutts wrote:
> So far so good, modulo the replication issue. (We don't have much funding yet... but if initial launch goes well, we're hopeful :)).

As people have mentioned, you can use repozo to get almost the same effect. Either that or do app-level replication ;-)

> It would be very nice if there were a faq or other doc addressing all of these scalability-related questions systematically for someone like me who understands what a relational database is doing for them (in return for squashing all their objects). It seems I'm not the only one with these concerns. :)

Welcome to Open Source, we look forward to seeing your newly-written faq as it becomes available online ;-)

> In the meantime, I had an idea about my current implementation: maybe instead of __getstate__ and __setstate__ I should put the external data in _v_data (marking as "volatile"). Then I could trap for its existence, and load if necessary; and also have an explicit "refresh" wired to a button in the GUI.

Yup, this is exactly what _v_ was designed for...

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
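A minimal sketch of the _v_data idea discussed above, written as plain Python so it runs without ZODB installed. The class `ExternalView`, the `_fetch` loader, and the `load_count` counter are hypothetical names for illustration; the hand-written `__getstate__` only mimics the rule that attributes starting with `_v_` are left out of the saved state, which a real `persistent.Persistent` subclass would get without any extra code.

```python
class ExternalView:
    """Plain-Python stand-in for a persistent object with a volatile cache."""

    def __init__(self, query):
        self.query = query    # ordinary stored state
        self.load_count = 0   # demo bookkeeping only

    def __getstate__(self):
        # Mimic the _v_ rule: volatile attributes are not part of saved state.
        return {k: v for k, v in self.__dict__.items()
                if not k.startswith("_v_")}

    def _fetch(self):
        # Hypothetical loader standing in for the real external query.
        self.load_count += 1
        return ["row for %s" % self.query]

    @property
    def data(self):
        # Trap for the attribute's existence; load only when it is missing.
        if getattr(self, "_v_data", None) is None:
            self._v_data = self._fetch()
        return self._v_data

    def refresh(self):
        # Explicit refresh, e.g. wired to a button in the GUI.
        self.__dict__.pop("_v_data", None)
        return self.data


view = ExternalView("select 1")
first = view.data
again = view.data              # served from the cache, no second load
saved = view.__getstate__()    # what would be stored: no _v_data key
```

In a real persistent object the volatile attribute can disappear whenever the object is deactivated, so the existence check in `data` is what keeps the pattern safe.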
RE: [Zope3-Users] zodb objects backup
Gary,

Thanks again for your patient answers to all my questions.

I'm trying to think of reasons why ZODB would not work in advance (as ZODB master/postgres slave), as this is a fundamental decision which would take much too much time to undo, and scalability testing is also time consuming.

So far so good, modulo the replication issue. (We don't have much funding yet... but if initial launch goes well, we're hopeful :)).

It would be very nice if there were a faq or other doc addressing all of these scalability-related questions systematically for someone like me who understands what a relational database is doing for them (in return for squashing all their objects). It seems I'm not the only one with these concerns. :)

WRT benchmarks, there are several standard benchmarks for SQL databases. If one of these had been ported to ZODB, even if it wasn't an exact fit, it would be quite a relief to know, starting out, that "well, I don't know how it will do on my application, but I know that it probably is within a factor of 2 or 3 of Postgres performance on something vaguely similar, so if I do have problems, I should be able to roll up my sleeves and fix them with suitable optimization or in the worst case buying new hardware, but not that much more than I'd need with Postgres."

...

In the meantime, I had an idea about my current implementation: maybe instead of __getstate__ and __setstate__ I should put the external data in _v_data (marking as "volatile"). Then I could trap for its existence, and load if necessary; and also have an explicit "refresh" wired to a button in the GUI. (Later, if Postgres was still the master, I could write a db trigger to set per-table update timestamp, accessible without doing a full requery.)

- Shaun
Re: [Zope3-Users] zodb objects backup
On Feb 26, 2006, at 4:33 PM, Shaun Cutts wrote:

> Thanks Gary!
>
> Ah -- very nice: so Data.fs *is* a transaction log. In theory an RDBMS with write ahead logging is still more secure because the transaction log is only backup, and the rest of the database is another copy of the current state (though not with undo capability).
>
> But with replication, this issue is taken care of. (Too bad replication isn't part of the core functionality)

Some people use ZRS, some people have strategies with DirectoryStorage, some people use repozo as described in the first link I sent, some are exploring other options like PGStorage.

> Also nice is
> http://www.python.org/workshops/2000-01/proceedings/papers/fulton/zodb3.html#pgfId=294502
>
> Section 3.1... so ZODB is effectively doing MVCCS and with per-object locks to resolve conflicts.

That paper is old: the ZODB is doing MVCC now with full views of the database at the time of transaction start. There's a doc in the wiki describing it.

> (Question: can one explicitly lock an object without changing it? I guess just setting _p_changed?)

That will mark it as changed whether or not it was, yes. I'm pretty sure (but notice caveat) that this will "dirty" the object, as far as write conflicts are concerned, whether or not the object actually changed.

> Are there any benchmarks available?

I believe there is a ZODB bench somewhere. I don't know much about it.

> We can't abandon Postgres entirely:
>
> 1) we have custom aggregate statistical functions in C
> 2) we have to allow third-party ODBC access to certain views
> 3) general lack of query language potentially problematic for datamining

Especially for third parties (non-Zope/ZODB experts).

Two "howevers": first, I'm led to believe that datamining generally happens externally from apps anyway, so the Postgres slave idea (that you have below) would work quite well.

Second, even in a ZODB app, as with an SQL app, if you know where the data resides, know the available indexes, know how to build and populate new indexes, and know how to intersect and union results, you can do just about anything you want. The difference is just that "everyone" knows SQL spelling for that stuff, and much fewer know the nitty-gritty of spelling that with Zope 3 indexes.

> But 1)-2)-3) for us are "read-only" needs, so in theory, with replication, we could use Postgres as a slave to ZODB master.

Yes, I've considered an architecture like that recently myself for some projects.

Other approaches are to use the Zope database adapters (which handle the transaction machinery), and then write simple wrappers that produce throw-away, non-persistent objects that persist the data in Postgres. Another would be to monetarily support someone like Shane to see if a solution like the PG storage will help.

I would not encourage (and, perhaps too gently, have not encouraged) someone without either a lot of ZODB knowledge or a lot of time and energy to become a very deep ZODB expert to pursue the __getstate__ __setstate__ approach you showed. It's an interesting idea, but you are really bypassing huge chunks of the ZODB machinery, probably to your loss. Much safer to deal with the Zope DBA stuff (persistent data in transient objects) or, if you are an expert or want to be one, with an approach like Shane's.

> Again, benchmarks would be nice. We haven't yet speced out, let alone bought, the hardware for our production system, so I couldn't yet say how high the bar is.

I don't have these, and I'm not even sure exactly what you want.

Gary
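Gary's point about querying by building indexes and then intersecting and unioning results can be illustrated with plain dicts and sets. This is a rough sketch, not Zope 3 catalog code: the `docs` data, the `build_index` helper, and the field names are all assumptions made up for the example.

```python
# Toy documents keyed by id (hypothetical data for illustration).
docs = {
    1: {"kind": "invoice", "year": 2005},
    2: {"kind": "invoice", "year": 2006},
    3: {"kind": "report", "year": 2006},
}


def build_index(documents, field):
    """Build a field index: field value -> set of document ids."""
    index = {}
    for doc_id, doc in documents.items():
        index.setdefault(doc[field], set()).add(doc_id)
    return index


kind_index = build_index(docs, "kind")
year_index = build_index(docs, "year")

# Roughly "WHERE kind = 'invoice' AND year = 2006": intersect two result sets.
invoices_2006 = kind_index["invoice"] & year_index[2006]

# Roughly "WHERE kind = 'report' OR year = 2005": union two result sets.
reports_or_2005 = kind_index["report"] | year_index[2005]
```

In a ZODB application the same shape would typically use persistent BTree-based sets rather than built-in dicts and sets, so the indexes are stored and loaded incrementally.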
Re: [Zope3-Users] zodb objects backup
On Feb 27, 2006, at 12:03 AM, Shaun Cutts wrote:

> > From: Tom Dossis [mailto:[EMAIL PROTECTED]
> > Shaun Cutts wrote:
> > > But with replication, this issue is taken care of. (Too bad replication isn't part of the core functionality)
> >
> > Maybe soon..
> > http://hathawaymix.org/Software/PGStorage
>
> Great!
>
> Another potential issue on the scalability front I had forgotten about earlier:
>
> How do BTrees perform under lots of concurrent updates? (I know this is a tricky one, as implementations can get pretty complex to deal with this.)

You've got a lot of questions. :-) I only have time to be quick, and hope that it is somewhat helpful. In that vein:

Pretty well. It has custom code to try to resolve conflicts. They are designed to handle it pretty well. That said, even with the special resolution code, catalogs (heavy BTree users) are often hotspots for conflicts, and sometimes get special treatment to serialize updates.

> I note that
> http://www.zope.org/Wikis/ZODB/FrontPage/guide/node6.html#SECTION000630000
> Says that "As with a Python dictionary or list, you should not mutate a BTree-based data structure while iterating over it". Does this apply only to thread-local modifications or to any modification by anyone else?

thread-local (or more correctly, connection-local)

> Ie, are BTrees "versioned" as the ZODB is... if I'm iterating over a BTree in my process (in ZEO, say), and another process modifies the BTree, does that sometimes show up in my copy, or only after commit?

Only after commit.

> Also, wrt "ConflictError" -- is the BTree considered one object, or are the python objects (buckets, tree structure, ...) treated separately?

separately.

> In general, are the BTrees just written "naively" on top of ZODB, or do they interact in some special way with the storage?

Hm. They are certainly not "naive" of the ZODB; their authors have deep knowledge of the ZODB, and take advantage of ZODB hooks (such as the conflict resolution) and behavior.

Gary
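The conflict-resolution hook mentioned above is spelled `_p_resolveConflict(oldState, savedState, newState)` in ZODB. The sketch below shows only the three-way merge idea for a simple counter, as a plain-Python stand-in: the `Counter` class and the choice of a bare integer as the object state are assumptions for illustration, and the much more involved bucket-level merging that BTrees do is not shown.

```python
class Counter:
    """Plain-Python stand-in showing the shape of a conflict-resolution hook."""

    def __init__(self, value=0):
        self.value = value

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # old_state:   the state both transactions started from
        # saved_state: the state already committed by the other transaction
        # new_state:   the state this transaction is trying to commit
        # Apply both deltas to the common starting point.
        return saved_state + new_state - old_state


# Two transactions both start from 10; one adds 3, the other adds 1.
merged = Counter()._p_resolveConflict(10, 13, 11)
```

When a merge like this is not possible, the storage raises a ConflictError and the transaction is normally retried, which is why heavily updated catalogs can still become hotspots.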
RE: [Zope3-Users] zodb objects backup
> From: Tom Dossis [mailto:[EMAIL PROTECTED]
> Shaun Cutts wrote:
> > But with replication, this issue is taken care of. (Too bad replication
> > isn't part of the core functionality)
>
> Maybe soon..
> http://hathawaymix.org/Software/PGStorage

Great!

Another potential issue on the scalability front I had forgotten about earlier:

How do BTrees perform under lots of concurrent updates? (I know this is a tricky one, as implementations can get pretty complex to deal with this.)

I note that

http://www.zope.org/Wikis/ZODB/FrontPage/guide/node6.html#SECTION000630000

Says that "As with a Python dictionary or list, you should not mutate a BTree-based data structure while iterating over it". Does this apply only to thread-local modifications or to any modification by anyone else?

Ie, are BTrees "versioned" as the ZODB is... if I'm iterating over a BTree in my process (in ZEO, say), and another process modifies the BTree, does that sometimes show up in my copy, or only after commit?

Also, wrt "ConflictError" -- is the BTree considered one object, or are the python objects (buckets, tree structure, ...) treated separately?

In general, are the BTrees just written "naively" on top of ZODB, or do they interact in some special way with the storage? If they are just sitting on top of ZODB, it would seem that this dims the possibility that one could use the ZODB as a Postgres replacement.

- Shaun
Re: [Zope3-Users] zodb objects backup
Shaun Cutts wrote:
> But with replication, this issue is taken care of. (Too bad replication isn't part of the core functionality)

Maybe soon..
http://hathawaymix.org/Software/PGStorage
RE: [Zope3-Users] zodb objects backup
Thanks Gary!

Ah -- very nice: so Data.fs *is* a transaction log. In theory an RDBMS with write ahead logging is still more secure because the transaction log is only backup, and the rest of the database is another copy of the current state (though not with undo capability).

But with replication, this issue is taken care of. (Too bad replication isn't part of the core functionality)

Also nice is

http://www.python.org/workshops/2000-01/proceedings/papers/fulton/zodb3.html#pgfId=294502

Section 3.1... so ZODB is effectively doing MVCCS and with per-object locks to resolve conflicts.

(Question: can one explicitly lock an object without changing it? I guess just setting _p_changed?)

Are there any benchmarks available?

We can't abandon Postgres entirely:

1) we have custom aggregate statistical functions in C
2) we have to allow third-party ODBC access to certain views
3) general lack of query language potentially problematic for datamining

But 1)-2)-3) for us are "read-only" needs, so in theory, with replication, we could use Postgres as a slave to ZODB master.

Again, benchmarks would be nice. We haven't yet speced out, let alone bought, the hardware for our production system, so I couldn't yet say how high the bar is.

- Shaun

> -Original Message-
> From: Gary Poster [mailto:[EMAIL PROTECTED]
> Sent: Saturday, February 25, 2006 3:39 PM
> To: Shaun Cutts
> Cc: Alen Stanisic; zope3-users@zope.org
> Subject: Re: [Zope3-Users] zodb objects backup
>
> Alen, please see
>
> http://www.zope.org/Wikis/ZODB/FileStorageBackup
>
> Shaun, many of the other questions in this thread--and others
> recently--are answered in this guide:
>
> http://www.zope.org/Wikis/ZODB/FrontPage/guide/index.html
>
> It is highly recommended reading if you are doing serious Zope 3 apps.
>
> Both of these are found in the ZODB wiki, which has some other
> helpful docs:
>
> http://www.zope.org/Wikis/ZODB/FrontPage
>
> Gary
Re: [Zope3-Users] zodb objects backup
Thanks Gary.

On Sat, 2006-02-25 at 15:39 -0500, Gary Poster wrote:
> Alen, please see
>
> http://www.zope.org/Wikis/ZODB/FileStorageBackup
>
> Shaun, many of the other questions in this thread--and others
> recently--are answered in this guide:
>
> http://www.zope.org/Wikis/ZODB/FrontPage/guide/index.html
>
> It is highly recommended reading if you are doing serious Zope 3 apps.
>
> Both of these are found in the ZODB wiki, which has some other
> helpful docs:
>
> http://www.zope.org/Wikis/ZODB/FrontPage
>
> Gary
Re: [Zope3-Users] zodb objects backup
Alen, please see

http://www.zope.org/Wikis/ZODB/FileStorageBackup

Shaun, many of the other questions in this thread--and others recently--are answered in this guide:

http://www.zope.org/Wikis/ZODB/FrontPage/guide/index.html

It is highly recommended reading if you are doing serious Zope 3 apps.

Both of these are found in the ZODB wiki, which has some other helpful docs:

http://www.zope.org/Wikis/ZODB/FrontPage

Gary
RE: [Zope3-Users] zodb objects backup
> On Sat, 2006-02-25 at 08:59 -0600, Andreas Jung wrote:
> >
> > --On 26. Februar 2006 01:52:29 +1100 Alen Stanisic
> > <[EMAIL PROTECTED]> wrote:
> >
> > > For some reason it doesn't feel completely safe just relying on Data.fs.
> >
> > That means what? Why shouldn't it be safe...please come up with some
> > reasonable arguments..
> >
> > -aj
>
> I did mention that it could be because most rdb systems have a database
> and also keep transaction logs. In case of a failure you put the latest
> backup of the db and transaction logs together and you could rebuild
> your db to the point just before the failure. If you only had a daily
> back up of your db you could potentially lose a full day of
> transactions.

One could also mention failover: if one computer with Data.fs goes down, you're down, period; whereas many RDBMSes support keeping slave copies of the database, which are then available.

Also, there are ACID transactions (see http://en.wikipedia.org/wiki/ACID) in a good RDBMS. I don't know how zodb ensures consistency if there are multiple concurrent users, and can we rollback gracefully?
Re: [Zope3-Users] zodb objects backup
On Sat, 2006-02-25 at 08:59 -0600, Andreas Jung wrote:
>
> --On 26. Februar 2006 01:52:29 +1100 Alen Stanisic
> <[EMAIL PROTECTED]> wrote:
>
> > For some reason it doesn't feel completely safe just relying on Data.fs.
>
> That means what? Why shouldn't it be safe...please come up with some
> reasonable arguments..
>
> -aj

I did mention that it could be because most rdb systems have a database and also keep transaction logs. In case of a failure you put the latest backup of the db and transaction logs together and you could rebuild your db to the point just before the failure. If you only had a daily back up of your db you could potentially lose a full day of transactions.

Alen
Re: [Zope3-Users] zodb objects backup
Alen Stanisic wrote:
> On Sat, 2006-02-25 at 07:52 -0600, Andreas Jung wrote:
>> --On 26. Februar 2006 00:04:39 +1100 Alen Stanisic
>> <[EMAIL PROTECTED]> wrote:
>>
>>> Hello,
>>>
>>> what would be the best way of taking a backup of persistent objects
>>> inside Data.fs with possibility to rebuild it on a fresh Zope 3 install
>>> in case of a disaster recovery lets say.
>>>
>> Just backup the Data.fs file.
>>
>
> For some reason it doesn't feel completely safe just relying on Data.fs.
> Maybe I am thinking too much in rdb land and transaction logging where
> you could rebuild your db from the logs.
>
> Alen

I once felt like this... But I've learned to stop worrying and love the ZODB :-)

I can recommend looking into the very useful repozo.py which ships as part of ZODB tools - at least in all the Zope-2 series (afaik). It's invaluable for making incremental backups of your Data.fs as it grows. (There are some issues about copying the Data.fs out of a running zope instance...)

I don't have a production application deployed with Zope3 (yet), but a quick 'tree' informs me that repozo.py isn't shipped with Zope-3.2. Can anyone tell me why not?

Cheers

Rupert
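For readers who have not used repozo, here is a small sketch of how its backup and recover invocations are usually put together, built as argument lists in Python so nothing is executed by accident. The helper names and every path below are hypothetical; the flags (`-B` backup, `-R` recover, `-r` repository directory, `-f` source file, `-F` force a full backup, `-o` output file, `-v` verbose) are taken from repozo's usage text as best I recall it, so check them against `repozo.py --help` on your own installation.

```python
def repozo_backup_args(repository, data_fs, full=False):
    """Build argv for a repozo backup (incremental unless full=True)."""
    args = ["repozo.py", "-B", "-v", "-r", repository, "-f", data_fs]
    if full:
        args.append("-F")  # force a full backup instead of an incremental one
    return args


def repozo_recover_args(repository, output):
    """Build argv for rebuilding a Data.fs from the backup repository."""
    return ["repozo.py", "-R", "-v", "-r", repository, "-o", output]


# Hypothetical paths, for illustration only.
backup_cmd = repozo_backup_args("/var/backups/zodb", "/srv/zope/var/Data.fs")
recover_cmd = repozo_recover_args("/var/backups/zodb", "/tmp/Data.fs.restored")

# To run one for real, something like:
#   import subprocess; subprocess.run(backup_cmd, check=True)
```

Running the backup command on a schedule gives a series of incremental files in the repository directory, and the recover command stitches them back into a single Data.fs.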
Re: [Zope3-Users] zodb objects backup
--On 26. Februar 2006 01:52:29 +1100 Alen Stanisic <[EMAIL PROTECTED]> wrote:

> For some reason it doesn't feel completely safe just relying on Data.fs.

That means what? Why shouldn't it be safe...please come up with some reasonable arguments..

-aj
Re: [Zope3-Users] zodb objects backup
On Sat, 2006-02-25 at 07:52 -0600, Andreas Jung wrote:
>
> --On 26. Februar 2006 00:04:39 +1100 Alen Stanisic
> <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > what would be the best way of taking a backup of persistent objects
> > inside Data.fs with possibility to rebuild it on a fresh Zope 3 install
> > in case of a disaster recovery lets say.
> >
>
> Just backup the Data.fs file.
>

For some reason it doesn't feel completely safe just relying on Data.fs. Maybe I am thinking too much in rdb land and transaction logging where you could rebuild your db from the logs.

Alen
Re: [Zope3-Users] zodb objects backup
--On 26. Februar 2006 00:04:39 +1100 Alen Stanisic <[EMAIL PROTECTED]> wrote:

> Hello,
>
> what would be the best way of taking a backup of persistent objects inside Data.fs with possibility to rebuild it on a fresh Zope 3 install in case of a disaster recovery lets say.

Just backup the Data.fs file.

-aj