Re: Cyrus with a NFS storage. random DBERROR
On Sun, 10 Jun 2007, Rob Mueller wrote:

I can try and keep an eye on bailouts some more, and see if I can get some more details. It would be nice if there was some more logging about why the bail out code path was actually called!

It typically means that something deep in libimap has thrown an error. sync_client logs the only information that it has (the return code r). It probably wouldn't hurt to try to log the current mailbox/user in some consistent fashion.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service, Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.

Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Cyrus with a NFS storage. random DBERROR
I suspect that the problem is with mailbox renames, which are not atomic and can take some time to complete with very large mailboxes.

I think there are some other issues as well. For instance, we still see skiplist seen state databases get corrupted every now and then. It seems certain corruption can result in the skiplist code calling abort(), which terminates the sync_server and causes the sync_client to bail out. I had a backtrace on one of them the other day, but the stack frames were all wrong, so it didn't seem that useful.

HERMES_FAST_RENAME: Translates mailbox rename into filesystem rename() where possible. Useful because sync_client chdir()s into the working directory. Would be less useful in 2.3 with split metadata.

It would still be nice to do this to make renames faster anyway. If you did:

1. Add new mailboxes to mailboxes.db
2. Filesystem rename
3. Remove old mailboxes

you end up with a race condition, but it's far shorter than the mess you can end up with at the moment if a restart occurs during a rename.

Together with my version of delayed expunge this pretty much guarantees that things aren't moving around under sync_client's feet. It's been an awfully long time (about a year?) since I last had a sync_client bail out. We are moving to 2.3 over the summer (initially using my own original replication code), so this is something that I would like to sort out. Any suggestions?

I can try and keep an eye on bailouts some more, and see if I can get some more details. It would be nice if there was some more logging about why the bail out code path was actually called!

Rob
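The three-step ordering above can be sketched as follows. This is only an illustration of the idea, not Cyrus code: the dict stands in for mailboxes.db, and the function name and arguments are assumptions. A crash between steps leaves at most a short-lived duplicate entry, never a mailbox directory with no mailboxes.db record.

```python
import os

def rename_mailbox(mboxlist, old, new, old_path, new_path):
    """Rename a mailbox with the new-entry-first ordering described above."""
    mboxlist[new] = mboxlist[old]      # step 1: add new entry first
    os.rename(old_path, new_path)      # step 2: atomic filesystem rename
    del mboxlist[old]                  # step 3: remove old entry last
```

If a restart happens after step 1 or step 2, recovery only has to reconcile a duplicate or dangling entry, rather than the current situation where the mailbox can be missing from both places.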
Re: Cyrus with a NFS storage. random DBERROR
does the IMAP spec specify how large a UUID can be?

UUIDs aren't part of the IMAP spec. They're an addition to Cyrus to help replication. By default, there's no way to access UUIDs via IMAP at all, since they're not part of the IMAP spec. The UUID size was chosen by David Carter when he implemented replication. It would be possible to change them again, but it would mean changing the cyrus.index file format and upgrading all cyrus.index files. Seemed easier not to do that.

Rob
Re: Cyrus with a NFS storage. random DBERROR
On Sat, 9 Jun 2007, Rob Mueller wrote:

I run it directly, outside of master. That way when it crashes, it can be easily restarted. I have a script that checks that it's running, that the log file isn't too big, and that there are no log-PID files that are too old. If anything like that happens, it pages someone.

Ditto, we do almost exactly the same thing.

And for that matter, so do I.

I think there are certain race conditions that still need ironing out, because rerunning sync_client on the same log file that caused a bail out usually succeeds the second time.

I suspect that the problem is with mailbox renames, which are not atomic and can take some time to complete with very large mailboxes. sync_client retries a number of times and then bails out:

    if (folder_list->count) {
        int n = 0;
        do {
            sleep(n*2);  /* XXX should this be longer? */
            ...
        } while (r && (++n < SYNC_MAILBOX_RETRIES));
        if (r) goto bail;
    }

This was one of the most significant compromises that Ken had to make when integrating my code into 2.3. My original code cheats, courtesy of two other patches:

HERMES_FAST_RENAME: Translates mailbox rename into filesystem rename() where possible. Useful because sync_client chdir()s into the working directory. Would be less useful in 2.3 with split metadata.

HERMES_SYNC_SNAPSHOT: If a mailbox action fails, promote to a user action (no shared mailboxes). If a user action fails, then lock the user out of the mboxlist and try again.

Together with my version of delayed expunge this pretty much guarantees that things aren't moving around under sync_client's feet. It's been an awfully long time (about a year?) since I last had a sync_client bail out. We are moving to 2.3 over the summer (initially using my own original replication code), so this is something that I would like to sort out. Any suggestions?

-- David Carter Email: [EMAIL PROTECTED] University Computing Service, Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
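The retry-then-bail pattern in the C snippet can be restated compactly. This is a sketch of the control flow only; SYNC_MAILBOX_RETRIES, the injectable sleep function, and the action callback are stand-ins, not Cyrus APIs:

```python
import time

SYNC_MAILBOX_RETRIES = 3

def retry_then_bail(action, sleep_fn=time.sleep):
    """Retry a failing action with a growing sleep, then give up."""
    n = 0
    while True:
        sleep_fn(n * 2)            # back off a little more each attempt
        r = action()               # nonzero return code means failure
        n += 1
        if not (r and n < SYNC_MAILBOX_RETRIES):
            break
    return r                       # nonzero here corresponds to "goto bail"
```

A fixed, small retry count is exactly why a long-running rename can exhaust the retries and trigger the bail-out path discussed in this thread.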
Re: Cyrus with a NFS storage. random DBERROR
Michael Menge wrote:

Hi, after the problem with the wiki was solved, I added a summary about CyrusCluster: http://cyrusimap.web.cmu.edu/twiki/bin/view/Cyrus/CyrusCluster .

Could you please describe the problems with replication in more detail? The CyrusReplication page says: "... The replication is asynchronous so you might lose some mails." I tested this functionality and didn't find a problem. If sync_client loses the connection to sync_server (link down, firewall drops TCP sessions, etc.) I just run 'sync_client -u username' to fix the problem. It's enough.

WBR. Dmitriy
Re: Cyrus with a NFS storage. random DBERROR
Hi, I haven't used replication myself, so the information is only based on what I have read on this list. The sync_client discovers all changes on the mailboxes, queues them, and sends them to the server. In case of a system crash there might be changes that are still queued and not yet sent to the server.

Quoting Dmitriy Kirhlarov [EMAIL PROTECTED]: [...]

M.Menge Tel.: (49) 7071/29-70316 Universitaet Tuebingen Fax.: (49) 7071/29-5912 Zentrum fuer Datenverarbeitung mail: [EMAIL PROTECTED] Waechterstrasse 76 72074 Tuebingen
Re: Cyrus with a NFS storage. random DBERROR
Michael Menge wrote:

Hi, I haven't used replication myself, so the information is only based on what I have read on this list. The sync_client discovers all changes on the mailboxes, queues them, and sends them to the server. In case of a system crash there might be changes that are still queued and not yet sent to the server.

It can be fixed by manually running 'sync_client -f not_finished_logfile', or 'sync_client -u user' if the logfile is lost. Paul describes a more interesting situation. I think it would be good to add a little more detail to the twiki on this topic.

WBR. Dmitriy
Re: Cyrus with a NFS storage. random DBERROR
Hi,

Dmitriy Kirhlarov wrote: [...] I tested this functionality and didn't find a problem. If sync_client loses the connection to sync_server (link down, firewall drops TCP sessions, etc.) I just run 'sync_client -u username' to fix the problem. It's enough.

... but asynchronous. It is possible that a message is delivered to your master, not yet (or not properly) synchronized to the backup, and the master fails (beyond repair). You can replace the master with the backup server, but you'll lose the unsynchronized messages (or other recent transactions). And when folders are not properly synchronized for some reason (that's why you need to check frequently with, for instance, make_md5*) you can lose more than just the most recent messages. Not necessarily a big problem, but just something to be aware of. It's not that your transaction is finished after the synchronization is done (that would be slower, but more reliable): the synchronization is done asynchronously based on a log.

While NFS (more reliable storage) might be an option for us now that I figured out that it works with -o nolock or separate metadata, I still think I'd add replication. It's very nice to have synchronization in the application, in a way that even allows for geographic separation. (And right now I'd trust it more than shared storage.)

Paul

* I'm still not fully using make_md5 myself. I still need to write a script that walks through the files and only compares the messages that are in both folders. If I run make_md5, it's never working on a folder on both servers at the same time, so there's always a gap. So the files aren't always equal, I think. Oh well.
Re: Cyrus with a NFS storage. random DBERROR
Dmitriy Kirhlarov wrote: [...] It can be fixed by manually running 'sync_client -f not_finished_logfile', or 'sync_client -u user' if the logfile is lost.

Which reminds me... isn't it strange that an unfinished logfile is removed when the cyrus master (or was it sync_client -r?) is restarted? It would make sense to me if the file were renamed / stored for later running through sync_client -f. (Or if sync_client -r read this file too before it starts rolling.) Perhaps I'm wrong.

Paul
Re: Cyrus with a NFS storage. random DBERROR
On Jun 8, 2007, at 11:36 AM, Paul Dekkers wrote:

Which reminds me... isn't it strange that an unfinished logfile is removed when the cyrus master (or was it sync_client -r?) is restarted? It would make sense to me if the file were renamed / stored for later running through sync_client -f. (Or if sync_client -r read this file too before it starts rolling.)

I agree. I think sync_client should process any pending log-* in the sync directory when it's later restarted. (Or at least have an option to do that.)

Do people run sync_client in the SERVICES section rather than START? The install-replication docs indicate to put it in START. If my replica goes away for a little while, sync_client exits and then I have to restart it manually and then process any pending logs. It would be nice if it just started automatically and picked up where it left off.

-nik Nik Conwell Boston University
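The startup behaviour proposed here can be sketched as a small helper. This is an assumption about how a wrapper might work, not existing sync_client behaviour: before rolling replication starts, collect leftover log-* files from a previous run, oldest first, so each can be replayed (e.g. via 'sync_client -f').

```python
import glob
import os

def pending_logs(sync_dir):
    """Return leftover log-* files in age order, oldest first."""
    files = glob.glob(os.path.join(sync_dir, "log-*"))
    return sorted(files, key=os.path.getmtime)
```

Replaying oldest first preserves the order in which the changes were originally logged, which matters if the same mailbox appears in more than one leftover file.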
Re: Cyrus with a NFS storage. random DBERROR
Hi, list.

Nik Conwell wrote: Do people run sync_client in the SERVICES section rather than START? The install-replication docs indicate to put it in START. If my replica goes away for a little while, sync_client exits and then I have to restart it manually and then process any pending logs. Would be nice if it just started automatically and picked up where it left off.

It doesn't work with ldap ptloader: http://lists.andrew.cmu.edu/pipermail/cyrus-devel/2007-April/000293.html

WBR. Dmitriy
Re: Cyrus with a NFS storage. random DBERROR
On 08 Jun 2007, at 06:52, Paul Dekkers wrote:

* I'm still not fully using make_md5 myself. I still need to write a script that walks through the files and only compares the messages that are in both folders. If I run make_md5, it's never working on a folder on both servers at the same time, so there's always a gap. So the files aren't always equal, I think. Oh well.

I don't have something to consume make_md5 data yet, either. My plan is to note the differences between the replica and the primary. On a subsequent run, if those differences aren't gone, they would be included in a report.

:wes

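The two-pass reporting idea described here can be sketched as follows. The data shape is an assumption (a mapping from (mailbox, uid) to an MD5 string per server, as one might parse from make_md5 output): differences seen on one run are only reported if they are still present on the next run, which skips messages that were simply mid-replication during the first pass.

```python
def diff_md5(primary, replica):
    """Keys whose MD5 differs from, or is missing on, the replica."""
    return {k for k in primary if replica.get(k) != primary[k]}

def persistent_diffs(prev_diffs, primary, replica):
    """Report only differences that survived from the previous run."""
    now = diff_md5(primary, replica)
    report = prev_diffs & now      # still different on a later run
    return report, now             # carry `now` forward as the new state
```

Anything in `report` is then a real candidate for corruption or a lost message, rather than the transient gap Paul describes.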
Re: Cyrus with a NFS storage. random DBERROR
I run it directly, outside of master. That way when it crashes, it can be easily restarted. I have a script that checks that it's running, that the log file isn't too big, and that there are no log-PID files that are too old. If anything like that happens, it pages someone.

:wes

On 08 Jun 2007, at 11:48, Nik Conwell wrote: Do people run sync_client in the SERVICES section rather than START? The install-replication docs indicate to put it in START. If my replica goes away for a little while, sync_client exits and then I have to restart it manually and then process any pending logs. Would be nice if it just started automatically and picked up where it left off.
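The file checks in the watchdog described above might look roughly like this. The directory layout, size limit, and age threshold are assumptions; the process check and the actual paging are left out:

```python
import glob
import os
import time

def check_sync_dir(sync_dir, max_log_bytes=50_000_000, max_age_s=3600):
    """Return a list of problems; non-empty would trigger a page."""
    problems = []
    log = os.path.join(sync_dir, "log")
    if os.path.exists(log) and os.path.getsize(log) > max_log_bytes:
        problems.append("log file too big")
    for f in sorted(glob.glob(os.path.join(sync_dir, "log-*"))):
        if time.time() - os.path.getmtime(f) > max_age_s:
            problems.append("stale " + os.path.basename(f))
    return problems
```

A growing rolling log means sync_client has fallen behind; an old log-PID file means a run was interrupted and its changes were never replayed.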
Re: Cyrus with a NFS storage. random DBERROR
I don't have something to consume make_md5 data yet, either. My plan is to note the differences between the replica and the primary. On a subsequent run, if those differences aren't gone, they would be included in a report.

Rather than make_md5, check the MD5 UUIDs patch below. Using this, we have a script that regularly checks both sides of a master/replica pair to check everything is consistent between the UUID and the computed MD5. It was this that let us discover the rare "didn't unlink old files" bug reported about 3 months back.

--- http://cyrus.brong.fastmail.fm/

One problem we've had is the inability to easily check that the files on disk correspond to what was originally delivered, to check for cyrus data corruption after either a disk problem or some other bug has caused us to be unsure of our data integrity. I wanted to calculate a digest and store it somewhere in the index file, but messing with the file format and fixing sync to still work, etc... it all sounded too painful.

So - added is a new option uuidmode in imapd.conf. Set it to md5 and you will get UUIDs of the form 02(first 11 bytes of the MD5 value for the message), which takes up the same space but allows pretty good integrity checking.

Is it safe? - we calculated that with one billion messages you have a one in 1 billion chance of a birthday collision (two random messages with the same UUID). They then have to get into the same MAILBOXES collection in sync_client to affect each other anyway. The namespace available for generated UUIDs is much smaller than this, since they have no collision risk - but if you had that many delivering you would hit the limits and start getting blank UUIDs anyway.

Mitigating even the above risk: you could alter sync_client to not use UUID for copying. It's not like it's been working anyway (see our other UUID related patch). As an integrity check it's much more useful.
The attached patch adds the md5 method, a random method which I've never tested and is almost certainly bogus but is there for educational value[tm], and the following FETCH responses in imapd:

FETCH UUID = 24 character hex string (02 + first 11 bytes of MD5)
FETCH RFC822.MD5 = 32 character hex string (16 bytes of MD5)
FETCH RFC822.FILESIZE = size of actual file on disk (via stat or mmap)

Totally non-standard of course, but way useful for our replication checking scripts. Embrace and extend 'r' us. Anyone feel like writing an RFC for fetching the digest of a message via IMAP? If the server calculated it on delivery and cached it, then you'd have a great way to clean up after a UIDVALIDITY change or other destabilising event without having to fetch every message again.

---

Rob
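The UUID scheme described in this message can be sketched as follows. This mirrors the description, not the actual patch code: a one-byte scheme prefix 0x02 followed by the first 11 bytes of the message's MD5, 12 bytes total, rendered as the 24-character hex string the FETCH UUID response would carry.

```python
import hashlib

def md5_uuid(message_bytes):
    """24-char hex UUID: 0x02 prefix + first 11 bytes of the MD5."""
    digest = hashlib.md5(message_bytes).digest()
    return (b"\x02" + digest[:11]).hex()
```

A checker then only has to recompute the MD5 of the file on disk and compare its first 11 bytes against the stored UUID to detect corruption.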
Re: Cyrus with a NFS storage. random DBERROR
I run it directly, outside of master. That way when it crashes, it can be easily restarted. I have a script that checks that it's running, that the log file isn't too big, and that there are no log-PID files that are too old. If anything like that happens, it pages someone.

Ditto, we do almost exactly the same thing. Also, if we switch master/replica roles, our code looks for any incomplete log files after stopping the master, and runs those first to ensure that replication is completely up to date.

It seems anyone seriously using replication has to unfortunately do these things manually at the moment. Replication just isn't reliable enough; we see sync_client bail out quite regularly, and there's not enough logging to exactly pinpoint why each time. I think there are certain race conditions that still need ironing out, because rerunning sync_client on the same log file that caused a bail out usually succeeds the second time.

It would be nice if some code was actually made part of the core cyrus distribution to make this all work properly, including switching master/replica roles.

Rob
Re: Cyrus with a NFS storage. random DBERROR
On Sat, 9 Jun 2007, Rob Mueller wrote:

So - added is a new option uuidmode in imapd.conf. Set it to md5 and you will get UUIDs of the form 02(first 11 bytes of the MD5 value for the message), which takes up the same space but allows pretty good integrity checking. Is it safe? - we calculated that with one billion messages you have a one in 1 billion chance of a birthday collision (two random messages with the same UUID). [...]

does the IMAP spec specify how large a UUID can be?

David Lang
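As a back-of-the-envelope check of the collision estimate quoted above, using the standard birthday approximation p ≈ n² / (2 · 2^88) for n messages in an 88-bit (11-byte) UUID space:

```python
def collision_probability(n, bits=88):
    """Birthday approximation: expected collision probability for n items."""
    return n * n / (2 * 2 ** bits)

# For n = 10**9 this comes out around 1.6e-9, i.e. on the order of the
# "one in 1 billion" figure claimed in the message above.
```

Note the approximation only holds while p is small, which is comfortably the case here.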
Re: Cyrus with a NFS storage. random DBERROR
Hi, after the problem with the wiki was solved, I added a summary about CyrusCluster: http://cyrusimap.web.cmu.edu/twiki/bin/view/Cyrus/CyrusCluster . Please feel free to add info about your experience with NFS.

Quoting Paul Dekkers [EMAIL PROTECTED]: [...]

M.Menge Tel.: (49) 7071/29-70316 Universitaet Tuebingen Fax.: (49) 7071/29-5912 Zentrum fuer Datenverarbeitung mail: [EMAIL PROTECTED] Waechterstrasse 76 72074 Tuebingen
Re: Cyrus with a NFS storage. random DBERROR
Hi,

Michael Menge wrote: after the problem with the wiki was solved, I added a summary about CyrusCluster: http://cyrusimap.web.cmu.edu/twiki/bin/view/Cyrus/CyrusCluster . Please feel free to add info about your experience with NFS.

Good suggestion. I will do some more testing; for now:

I did some experiments on FreeBSD. I noticed that NFSv3 with -L (the -o nolock equivalent) works too for storing both metadata and data on NFSv3, and that separating metadata to a local (UFS) partition while mounting with normal NFS options for the data partition works too. (As with Linux. But a little bit faster actually, while storing both metadata and data on NFS with local locking seems to be a bit slower.)

FreeBSD and NFSv4 is a no-go: I get a bunch of "Fatal error: failed to mmap" errors. And I guess this also indicates what makes NFS(v3) tricky: the mmaps have to work, and I'm not sure if this can be considered 100% safe. (Still reluctant to put this in production therefore, but perhaps there's nothing to fear.)

For Linux, NFSv4 worked better: separating metadata and storing the data on NFSv4 is fine, but putting the metadata on NFSv4 too doesn't work that well. Performance is bad, throughput decreases, and things stall for a while eventually. I just checked with Fedora 7 too, with similar results as with RHEL 4.

So it sounds like Linux and FreeBSD share the same matrix for NFSv3: either local locking (no NFS locking) for all data, or NFS locking for the data and local (ext3/UFS in my case) metadata. NFSv4 doesn't really add much.

I'm not sure how much testing I can do to assure that it is safe to put data on NFS while storing metadata on the local filesystem. It doesn't give an active-active setup, but there are certainly advantages in performance and reliability.
(But perhaps GFS is still worth checking, if not just for the metadata ;-))

Paul
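The split that works in these tests (message data on NFS, metadata on a local filesystem) can be expressed in imapd.conf. The metapartition_files list below is the one used in this thread; the partition paths are illustrative assumptions, so adjust them to your layout:

```
partition-default: /var/spool/imap            # message files, NFS-mounted
metapartition-default: /var/spool/imapmeta    # metadata, local ext3/UFS
metapartition_files: header index cache expunge squat
```

With this in place, the lock- and mmap-sensitive metadata files stay on the local disk while only the plain message files live on NFS.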
Re: Cyrus with a NFS storage. random DBERROR
Hi,

It took me a while before I found the time to try the meta-partitions and NFS backed (data-)partitions, but:

Dmitriy Kirhlarov wrote: On Thu, May 03, 2007 at 05:08:52PM +0200, Paul Dekkers wrote: I recently tried to use NFS (on a RedHat client, both to a NetApp filer as well as a RedHat NFS server) and I'll share my experiences:

Michael Menge wrote: Cyrus has 2 problems with NFS. 1. Cyrus depends on filesystem locking. NFS-4 should have solved this problem but I have not tested it. 2. BerkeleyDB uses shared memory, which does not work across multiple servers.

I used skiplist in the tests (default with Simon's RPM), and initially just used NFSv3 (and I also tested NFSv4): as long as I mounted with the -o nolock option it actually worked quite well (also on NFSv3). The performance was even better with the NetApp as target than with a local filesystem (and NFSv3 was faster than v4). The nolock option does not disable locking for the filesystem (as I understand it), it just disables locking over NFS, so other nodes won't have the same file locked. (Correct me if I'm wrong.) My intention was not to have an active-active setup, so in that regard this might not be that bad. Not sure what other catches there are though.

Did you try the metapartition* options? If you don't need an active-active setup they can be useful.

I didn't try metapartitions with my -o nolock experiment (which of course doesn't work with active-active either), but now I did another experiment with regular NFS locking (no special mount options) and a metapartition for every type of metadata (metapartition_files: header index cache expunge squat). I'm glad to say that this seems to work quite well!

Similar to the -o nolock, actually, but it sounds more solid without the tweaking. We use a NetApp as NFS filer, and it actually seems to perform a bit better than our (this) internal RAID5, load is similar, ... and fortunately no errors from the imaptest. It sounds like this could work.

But I'm not sure about the Cyrus internals, if there are any catches; Ken (or someone else), could this be considered safe? If it is safe, I'd prefer to use NFS because performance is similar (or better) and the filers are more reliable than our RAID5 setup. (I won't go into details, but it's basically a physically separated RAID-1 set of drives in RAID-6-ish.) (Performance-wise I only tried small folders, but as soon as the metadata is cached, I think there are not a lot of directory reads when a folder is opened, so that doesn't really matter... right?)

I stressed the setup with the imaptest tool from Dovecot. I saw problems with that in the past (also with NFSv3 and v4, but in combination with Cyrus 2.2, and I'm not sure if I tried nolock); now it seemed to do just fine. Only NFSv4 does not seem to be the answer; it seems that -o nolock is (on Linux as client). I'm very hesitant to put this into production; I just wanted to do some more tests and ask others after that if they think this is wise or not... I couldn't find the time to do more tests... (like seeing how RedHat 5 behaves instead of RedHat 4, if the trick also works on FreeBSD, if I can make it fail one way or another... suggestions always welcome...)

I still have to try how RedHat 5 and FreeBSD behave,

Paul
Re: Cyrus with a NFS storage. random DBERROR
On Thu, May 03, 2007 at 05:08:52PM +0200, Paul Dekkers wrote: I recently tried to use NFS (on a RedHat client, both to a NetApp filer as well as a RedHat NFS server) and I'll share my experiences:

Michael Menge wrote: Cyrus has 2 problems with NFS. 1. Cyrus depends on filesystem locking. NFS-4 should have solved this problem but I have not tested it. 2. BerkeleyDB uses shared memory, which does not work across multiple servers.

I used skiplist in the tests (default with Simon's RPM), and initially just used NFSv3 (and I also tested NFSv4): as long as I mounted with the -o nolock option it actually worked quite well (also on NFSv3). The performance was even better with the NetApp as target than with a local filesystem (and NFSv3 was faster than v4). The nolock option does not disable locking for the filesystem (as I understand it), it just disables locking over NFS, so other nodes won't have the same file locked. (Correct me if I'm wrong.) My intention was not to have an active-active setup, so in that regard this might not be that bad. Not sure what other catches there are though.

Did you try the metapartition* options? If you don't need an active-active setup they can be useful.

I stressed the setup with the imaptest tool from Dovecot. I saw problems with that in the past (also with NFSv3 and v4, but in combination with Cyrus 2.2, and I'm not sure if I tried nolock); now it seemed to do just fine. Only NFSv4 does not seem to be the answer; it seems that -o nolock is (on Linux as client). I'm very hesitant to put this into production; I just wanted to do some more tests and ask others after that if they think this is wise or not... I couldn't find the time to do more tests... (like seeing how RedHat 5 behaves instead of RedHat 4, if the trick also works on FreeBSD, if I can make it fail one way or another... suggestions always welcome...)

On FreeBSD you can use gmirror+ggated for mirroring a disk partition between servers.

WBR. Dmitriy
Cyrus with a NFS storage. random DBERROR
I am testing Cyrus with NFS storage, with two *identical* cyrus + postfix servers. Both /var/spool/imap and /var/imap are mounted by both servers (the socket directory is moved out of the mount). Everything seems to be working fine, but I find that sometimes dupelim doesn't work. When I tried to debug, I got errors like these in my maillog:

DBERROR: skiplist recovery /var/imap/deliver.db: ADD at E2C8 exists

What could be the reason? I am using cyrus-imapd 2.3.7 on CentOS 4.4 on both servers, with a NetApp box for storage.

Thanks
Ram
Re: Cyrus with a NFS storage. random DBERROR
Hi,

Cyrus has 2 problems with NFS. 1. Cyrus depends on filesystem locking. NFS-4 should have solved this problem but I have not tested it. 2. BerkeleyDB uses shared memory, which does not work across multiple servers.

You can try to convert all your databases to skiplist and run with NFS-4 and proper filesystem locking. But as far as I know, only skiplist on GFS has been tested in a setup like yours. Have a look at the threads on "Cyrus and Cluster Filesystem" and "Cyrus HA". Some time ago I wrote a summary of the info in these threads and wanted to add it to the Wiki, but have had no success so far.

Regards
Michael

Quoting ram [EMAIL PROTECTED]: [...]

M.Menge Tel.: (49) 7071/29-70316 Universitaet Tuebingen Fax.: (49) 7071/29-5912 Zentrum fuer Datenverarbeitung mail: [EMAIL PROTECTED] Waechterstrasse 76 72074 Tuebingen
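For reference, a sketch of the database-backend conversion suggested above: imapd.conf lines selecting skiplist for databases that might otherwise use BerkeleyDB. The duplicate_db setting covers deliver.db, the file in the reported error; the exact set of *_db options varies by version, so check them against your Cyrus release's imapd.conf(5):

```
duplicate_db: skiplist
mboxlist_db: skiplist
tlscache_db: skiplist
```

Note that changing a backend requires converting the existing database file (e.g. with cvt_cyrusdb) before restarting, not just editing the config.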