Re: testsuite run for 2.4.19 [WAS: [PATCH v4] imapd.c: imapoptions: implement idle timeout]
Hi ellie, the testsuite run was all good for the upcoming 2.4.19. It tested folder creation, LMTP, POP3, message create/move/delete and the ANNOTATEMORE extension for "Kolab folder types", including access via horde / horde's IMAP library. -> green light from me (if that counts anything). Cheers, Thomas
Re: [PATCH v4] imapd.c: imapoptions: implement idle timeout
Hi ellie, On Monday, 15 May 2017 08:32:30 CEST ellie timoney wrote: > Planning to cut a release straight from the 2.4 git HEAD at time of > writing (bfc80b3). The release tarball is already built and uploaded at > https://cyrusimap.org/releases/cyrus-imapd-2.4.19.tar.gz > > If you could give that a run through your test suite, that'd be amazing. > If it works okay, I'll push the tag upstream, send out the formal > announcement, and we're done. There's a couple of patches that weren't > there last week, but mostly it's the same as what's been there for ages. first quick test ran fine. IMAP idle timeout patch was also identical with what has landed in the git repo. I've now started the full testsuite which takes around 12 hours to finish (tests a lot of stuff from cyrus to IPSec, horde + php etc). Hopefully the tests will run fine since I'm not sure I'll have an Internet connection at home to check on the progress. Switched providers today... Cheers, Thomas
Re: [PATCH v4] imapd.c: imapoptions: implement idle timeout
Hi Ellie, On Tuesday, 09 May 2017 03:15:07 CEST ellie timoney wrote: > I'm looking at doing a new release of 2.4 soon -- there's been a number > of issues coming up lately which are already fixed in git, but aren't in > a released version. It might save some support effort if these fixes > were released -- though it is a somewhat substantial change, what with > the refactored idled. > > Are you still running 2.4 there? Do you have any other local > patches/backports that have been necessary to get the refactored idled > stable? Would appreciate those too, if so :) yes, still running 2.4. So far no other patches needed, the idled stuff has been stable since September 2016. We have an automated testsuite for our whole distribution, so once there's a new 2.4 release candidate, we could try to plunge it in and give it a go. Do you want to cut a release straight from 2.4 git HEAD? Or apply some other patches first? Cheers, Thomas
Re: Re: [PATCH v4] imapd.c: imapoptions: implement idle timeout
Hi Ellie, On Tuesday, 4. October 2016 11:02:39 ellie timoney wrote: > This version of the patch is now on master. Thanks very much for > putting it together :) nice :) Thanks. > I've opened https://github.com/cyrusimap/cyrus-imapd/issues/42 for > backporting it to 2.5 -- expecting to do it soon, but making sure it > doesn't get forgotten if I get distracted! we also have a version of this patch for cyrus 2.4 in production now. If you're interested, we can upstream it, too. Cheers, Thomas
Re: Re: [RFC PATCH v2] imapd.c: imapoptions: implement idle timeout
Hi Ellie, On Monday, 12. September 2016 11:35:45 ellie timoney wrote: [clock jumps] > Or does it? The man page says it's "not affected by discontinuous > jumps in the system time (e.g., if the system administrator manually > changes the clock)" -- great -- "but is affected by the incremental > adjustments performed by adjtime(3) and NTP". Which sounds to me like > NTP might still be an issue? (But: I have no real world experience of > this, I'm just reading man pages here.) Good point. Not sure here, we didn't encounter an issue for a long time. The event itself is rather rare these days. > > Would it make sense to enable the timeout by default? > > In the current version of the patch it's disabled (value 0). > > I'm interested in hearing thoughts on this, particularly with regard to > what a reasonable default timeout might be. Though I like the "no > behaviour change unless you change configuration" aspect of defaulting > to 0. We'll push out the three days default value next week. I can report back in a month how good or bad the results are. Cheers, Thomas
Re: Re: [RFC PATCH v2] imapd.c: imapoptions: implement idle timeout
Hi Ellie, On Thursday, 8. September 2016 11:07:54 ellie timoney via Cyrus-devel wrote: > I can kind of see a point about the ability to request a monotonic > clock, such that some sysadmin changing the system time doesn't > prematurely timeout a bunch of idle connections. Though that doesn't > seem likely to be a real problem in practice (a: don't do that, b: the > clients will just reconnect in a bit anyway). time jumps issued via NTP are a recurring source of trouble. On our distribution we have detection logic for that and restart quite a few services if it happens. The monotonic clock avoids that nicely. > Anyway, the patch applies cleanly, builds fine for me, and passes our > current tests. I'll try and put together a couple of Cassandane tests > for the new behaviour and see what shakes out of that. Would it make sense to enable the timeout by default? In the current version of the patch it's disabled (value 0). We plan on rolling it out with a default value of three days for our systems. That's enough to keep workstations that are left on over the weekend happy, but short enough to clean up "forgotten" clients idling on the INBOX/Drafts folder. There are still quite a few Outlook 2007 installations with broken IMAP IDLE handling in the wild, so the RFC recommended timeout of 29 minutes is unsuitable. Cheers, Thomas
Re: Re: Timeout for IMAP idle
On Thursday, 28. July 2016 17:16:03 Carlos Velasco via Cyrus-devel wrote: > Looking into idled.c from v2.5.8 there is a timeout: > > /* Set inactivity timer (convert from minutes to seconds) */ > idle_timeout = config_getint(IMAPOPT_TIMEOUT); > if (idle_timeout < 30) idle_timeout = 30; > idle_timeout *= 60; > ... > case IDLE_MSG_NOTIFY: > ... > /* send a message to all clients idling on mboxname */ > t = (struct ientry *) hash_lookup(msg->mboxname, ); > for ( ; t ; t = n) { > n = t->next; > if ((t->itime + idle_timeout) < time(NULL)) { > /* This process has been idling for longer than the > timeout * period, so it probably died. Remove it from the list. */ > if (verbose || debugmode) > syslog(LOG_DEBUG, "TIMEOUT %s\n", > idle_id_from_addr(>remote)); > > remove_ientry(msg->mboxname, >remote); yes, we run idled. I think that might be the problem here, the expire of idle clients is done only if there is activity in the folder. The clients that were idling for two weeks were doing it on the INBOX/Drafts folder. So that's probably an area that needs improvement in idled. @Bron: What about a "global" timeout check in the main loop of idled ever XX minutes? Cheers, Thomas
Re: Re: Timeout for IMAP idle
On Thursday, 28. July 2016 16:48:51 Sebastian Hagedorn wrote: > According to the RFC, clients need to "renew" the IDLE state at least > every 30 minutes. I would assume that the clients you see are doing that. > If not, you might see a different issue. I tested it a few days ago: Connected to imapd on a unix socket and started the IDLE command. -> the session was still running today :) There is no server side timeout in cyrus imapd. Cheers, Thomas
Timeout for IMAP idle
Hello, right now cyrus imapd (2.4 / 2.5) doesn't have a timeout for connections in the "IDLE" state. One mail server here is serving a lot of Android K9 mail clients that make heavy use of IMAP idle. So far around 2300 imapd processes piled up even though it's just about 100 users in total. Some of those connections are in the IDLE state for two weeks now. What about adding a configurable timeout for IMAP idle? There were bugs in the past concerning Outlook Express and at least Outlook 2007 showing an offline status once the connection was closed on the server side. Various IMAP servers had quirks for this. Still an idle timeout of let's say 48h might make sense. Cheers, Thomas
Re: Cyrus database and file usage data
On Monday, 13. June 2016 00:05:13 Bron Gondwana via Cyrus-devel wrote: > If twoskip is too slow (possible), then I've been quite interested in > looking at rocksdb (http://rocksdb.org/) as an embedded engine that has > really good performance, prefix scanning, and a good community around it. > It's also quite compatible with object storage because all but the > level0 "hot" databases are read-only, so you can store them as objects > once and then not need to scan them again. rocksdb sounds like an interesting piece of technology. Secondary indexes look a bit tricky though. Would the new cyrus databases benefit from secondary indexes at some level? Good read: https://voltdb.com/blog/foundationdbs-lesson-fast-key-value-store-not-enough and: https://eng.uber.com/schemaless-part-one/ > An alternative there is multi-level databases in the same way we have the > search tiers - with offline repack and substituting a new database with > identical contents (minus dead records) atomically in the way that we do > it with search. This eliminates the stop-the-world repacks that > occasionally hit us with both cyrus.index/cyrus.cache and all the > twoskip/skiplist databases, because repack can be done in the background > to new read-only files, with all writes happening to a small level0 > database. I absolutely hate it when a routine git command suddenly takes ten minutes to complete because it's doing a large repack on the server side. -> offline repack is a great idea > It is a massive change to the on-disk data formats! We'd be left with > basically: > > * key value stores > * cache format (multiple fixed-length binary items per file with file > number + offset addressing) * rfc822 messages (either stick with > one-file-per-message or do some MIX style multiple-per-file - this can be > independent) another thing to consider: Data corruption. How will the new code / databases deal with different levels of data corruption? One of the complains that I read about cyrus vs. dovecot is that dovecot is self-healing when it's discovering a corrupted mailbox while cyrus-imapd requires an admin to run reconstruct & friends. I still need to run reconstruct every six months or so when a power outage corrupted a db file. May be some ideas can also be found here: http://wiki.dovecot.org/Design/Indexes http://wiki2.dovecot.org/Design/Indexes/TransactionLog http://wiki.dovecot.org/IndexFiles http://wiki2.dovecot.org/MailboxFormat/dbox Personally I've never seen the idea of a "lockless integer" before, but it looks like a neat trick. Cheers, Thomas
v3.0: IMAP hierarchy flexibility
Hi Bron, given the switch to UUIDs for the folder structure in the backend of v3.0*, I'm wondering if the following folder structure can be supported: - Inbox/folder_below_inbox - folder_next_to_inbox - user/smith/other_shared_folder Background info: Some users want to create folders below inbox and folders next to inbox. "altnamespace" is a workaround in that direction. Microsoft Exchange and Dovecot support this feature and that's why management keeps nagging me. Cheers, Thomas *) no more getting paged at 3am in the morning because of folder renames.
Re: objectstore: don't build unless asked for
On Friday, 6. November 2015 18:44:52 ellie timoney via Cyrus-devel wrote: > I think the main questions at this point are: > > * are we happy with the combination of --enable-objectstore to turn the > feature on, plus --with-[backend] to select a backend, and default to > dummy if no other backend was selected? what's the plus side of having the dummy backend if you do --enable-objectstore? To me it sounds like it was added as compat layer if no real objectstore is available. Since the objectstore will be optional, is the dummy backend really needed? Cheers, Thomas
Re: Object Storage alpha lands on master
On Tuesday, 13. October 2015 23:03:23 Bron Gondwana wrote: > I'm almost positive that reconstruct won't work correctly with object > storage turned on right now. ...which reminds me that reconstruct is still broken in the 2.4 branch since only half of my patch was applied. The gory details: http://lists.andrew.cmu.edu/pipermail/cyrus-devel/2015-August/003395.html Cheers, Thomas
Re: Re: [PATCH master] idled: Add missing signals_poll() call
On Tuesday, 29. September 2015 08:35:34 ellie timoney wrote: > On Mon, Sep 28, 2015, at 11:28 PM, Thomas Jarosch wrote: > > On Monday, 28. September 2015 11:56:51 ellie timoney wrote: > > > Hi Thomas, > > > > > > This is now on master. Thanks for the patch! :) > > > > > > ellie > > > > thanks Ellie. Will the 2.5 backport I sent also be merged? > > > > Cheers, > > Thomas > > Do you mean the 2.4 backport? That's my intention, but I haven't gotten > to it yet no hurry with the backport. I deployed the 2.4 backport on a production server with 850+ users today. So far everything is smooth *fingers crossed*. I was thinking about this one: http://asg.andrew.cmu.edu/archive/message.php?mailbox=archive.cyrus-devel=4650 -> it's the same patch as for master, just with tabs for 2.5. Thomas
[2.4 pull request] idle / idled fixes backport to 2.4
The following changes since commit e302cae2f208741c98fa73e6663963c9343ed744: mailbox: fix lock management over rename (2015-07-20 03:27:51 +1000) are available in the git repository at: https://github.com/thomasjfox/cyrus-imapd.git imap-idle-fixes-2.4-backport for you to fetch changes up to fcf7797003ae4c820c255828be2f89b52ca9953e: idled: Add missing signals_poll() call (2015-09-25 10:23:42 +0200) Additional note: The source code differences between 2.4 and 2.5 on the files imap/idle*. are pretty minimal now. The new code is running on two production level servers with idled enabled. More testing welcome. Bron Gondwana (7): idle: shut down on SIGQUIT idle: don't lose socket after first use idle: return 0 if idle not enabled idled: fix broken message formatting idle: don't fork if running as a daemon idle: use NOWAIT if possible idle: don't access mailboxes.db directly, use allmbox Greg Banks (23): idled: better signal handling idle: whitespace cleanup idle: rename messages IDLE_FOO -> IDLE_MSG_FOO idle: remove "freelist" ientry memory management idle: some const correctness idle: better function names in idled idle: rename idle_data_t -> idle_message_t idle: move message send/recv into a common file idle: centralise AF_UNIX socket creation idle: rename notify_sock -> idle_sock idle: unlink local AF_UNIX socket when done idle: don't use signals, use AF_UNIX dgrams idled: use library function shutdown_file() idle: don't do FD_ISSET(-1,) it's unhelpful idled: fork after creating socket [IRIS-1831] Improve logging in idle code. [IRIS-1831] don't syslog on ENOENT from idle_send Bug 3648 - lib/signals.c handles multiple signals lib/signals.c: better signal handling signals.c doesn't use SA_RESTART for SIGTERM Whitespace cleanup in signals.c idle: close SIGTERM/select race Also use signals_select() in prot_*() functions Max Matveev (1): One copy of common signal processing is enough Thomas Jarosch (3): Clear sigaction struct before passing it to the kernel autoconf: Check for pselect() idled: Add missing signals_poll() call configure.in| 2 +- imap/Makefile.in| 14 +- imap/global.c | 6 + imap/idle.c | 295 ++ imap/idle.h | 36 -- imap/idled.c| 307 +++- imap/idlemsg.c | 215 +++ imap/{idled.h => idlemsg.h} | 45 +-- imap/imapd.c| 65 +- imap/lmtpd.c| 4 +- imap/nntpd.c| 4 +- imap/pop3d.c| 4 +- imap/signals.c | 115 - imap/signals.h | 54 lib/prot.c | 6 +- lib/signals.c | 130 --- lib/signals.h | 5 + 17 files changed, 729 insertions(+), 578 deletions(-) create mode 100644 imap/idlemsg.c rename imap/{idled.h => idlemsg.h} (71%) delete mode 100644 imap/signals.c delete mode 100644 imap/signals.h
[PATCH 2.5] idled: Add missing signals_poll() call
This fixes broken shutdown on platforms without pselect(). A side effect of the "#ifdef HAVE_PSELECT" code path in signals_select() is that it calls signals_poll_mask() on each run. So it will pick up previously delivered signals. --- imap/idled.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/imap/idled.c b/imap/idled.c index b35c7c2..8850ce1 100644 --- a/imap/idled.c +++ b/imap/idled.c @@ -343,6 +343,8 @@ int main(int argc, char **argv) for (;;) { int n; + signals_poll(); + /* check for shutdown file */ if (shutdown_file(NULL, 0)) { /* signal all processes to shutdown */ -- 1.9.3
Re: Re: Failed to create - mailbox is locked
On Thursday, 3. September 2015 22:56:08 Bron Gondwana wrote: > I should include: the plan to fix this is to have intent logging in > mailboxes.db - so we'll create a record for the destination name with a > flag marking the intent to rename, such that you can't possibly create a > rename loop. any mechanism to detect stale entries if the process stalls / dies before actually executing the desired action? Thomas
cyrus 2.4 deadlock identified: SIGALRM race
Hey there, thanks to the recent lock debugging tool[1] and very good luck, I was able to spot the mysterious cyrus 2.4 (and earlier) deadlock. Here's the output from the lock debugger: /usr/cyrus/bin/imapd (pid 3301) holding WRITE lock for /datastore/imap-mails/user/projects/cyrus.index /usr/cyrus/bin/imapd (pid 21130) ++WAITING++ for WRITE lock on /datastore/imap-mails/user/projects/cyrus.index /usr/cyrus/bin/imapd (pid 20536) ++WAITING++ for WRITE lock on /datastore/imap-mails/user/projects/cyrus.index .. Backtrace of process 3301: #0 0xb77c9428 in __kernel_vsyscall () #1 0xb735af91 in __lll_lock_wait_private () from /lib/libc.so.6 #2 0xb72c88fe in _L_lock_9705 () from /lib/libc.so.6 #3 0xb72c66f0 in malloc () from /lib/libc.so.6 #4 0x080b7557 in xzmalloc (size=32) at xmalloc.c:68 #5 0x080a27b6 in seqset_init (maxval=0, flags=1) at sequence.c:59 #6 0x0806d152 in index_tellexpunge (state=0x9421ca8) at index.c:2319 #7 index_tellchanges (state=0x9421ca8, canexpunge=1, printuid=0) at index.c:2370 #8 0x08071041 in index_check (state=0x9421ca8, usinguid=1, printuid=0) at index.c:682 #9 0x080515ae in idle_update (flags=(IDLE_MAILBOX | IDLE_ALERT)) at imapd.c:2833 #10 0x0809abc5 in idle_handler (sig=14) at idle.c:197 #11 #12 0xb72c52d4 in _int_malloc () from /lib/libc.so.6 #13 0xb72c66fa in malloc () from /lib/libc.so.6 #14 0xb74bb21c in ?? () from /usr/lib/libcrypto.so.1.0.0 Backtrace stopped: previous frame inner to this frame (corrupt stack?) Tadaaa! We are in a middle of a malloc() call, SIGALRM triggers for imap idle and does another malloc() call that deadlocks. -> never ever put complex code in signal handlers. Only set a volatile flag and be done with it. After I killed process 3301, all the other processes resumed operation as normal. The good news: This specific deadlock shouldn't happen anymore in 2.5+ as the idle code was refactored a few years ago: -- commit 17eb391b918c394319e4d1fe5985de10128f34d7 Author: Greg BanksDate: Fri Mar 23 17:27:32 2012 +1100 idle: don't use signals, use AF_UNIX dgrams Communications back from idled to imapds are via a message sent on the AF_UNIX socket. The IDLE command is implemented as a select() loop, and there's absolutely nothing that needs to be done in signal handler context. Best of all, no more unexpected delivery of SIGUSR1 or SIGUSER2, assassinating innocent bystander processes. -- @Ken: The keep_alive() function in httpd.c (CalDAV) probably suffers from the same signal handler issue. Cheers, Thomas [1] http://lists.andrew.cmu.edu/pipermail/cyrus-devel/2015-July/003378.html
Re: Re: which quota backend for cyrus imapd?
Hi Bron, On Friday, 7. August 2015 07:48:06 Bron Gondwana wrote: On Sat, Jul 25, 2015, at 01:49, Thomas Jarosch wrote: Hi, which quota backend is the recommended backend nowadays? Obvious candidates are quotalegacy and skiplist. quotalegacy has per-user files while skiplist would use one central file for all users, which means more lock contention. The man page entry for quota_db doesn't give any explanation which backend to prefer. Which backend is used at Fastmail? :) Quotalegacy. And for precisely the reason you might imagine - lower lock contention :) thanks, it's good to know it's used in a large scale installation. There are some fixes in the 2.5 code base that need porting to 2.4, I'll see what I can do about that in September. Mostly minor stuff about closing file descriptors (=unlocking) on error. So those fixes could at least land on the 2.4 branch, independent if there's going to be another release or not. Thomas
/proc/locks debug helper (cyrus deadlock tracing)
Hello, attached is a small helper script that parses the output of /proc/locks. The file format is described here: https://www.centos.org/docs/5/html/5.2/Deployment_Guide/s2-proc-locks.html It resolves the program name from the pid number and also resolves the filename from the inode number. That way we can see which program is currently holding an exclusive lock on a file and which program is waiting for the lock to come free. This would probably also show an ABBA deadlock, though I didn't encounter one in the wild yet. You may need to adapt the paths in the config section of the program to your local setup. The layout of /proc/locks might change depending on the kernel version, I've tested it with 3.14.x. The script works with python 3.x and 2.7. When you call it with --shown-non-waiting, it will show you which mailboxes currently hold a READ lock. The program does a bit of filtering to keep the noise ratio down, so be sure to check out the other options described in --help. Hopefully this helps others if they should run into a deadlock situation. Cheers, Thomas #!/usr/bin/python3 # Parse /proc/locks and list waiting programs. # Useful for identifying the cause of a deadlock. # Should be executed as root for /proc access. # # If you identified the first hanging process, # run gstack pid to get a stack trace. # # Licensed under the same terms as cyrus-imapd 2.4 / 2.5 / 2.6 # (c) 2015 Intra2net AG - Thomas Jarosch import os import re import argparse parser = argparse.ArgumentParser(description='Parse /proc/locks with special ' 'regard for cyrus imapd') parser.add_argument('--all-programs', action='store_true', default=False, help='Show all locks, not just cyrus') parser.add_argument('--show-sockets', action='store_true', default=False, help='Show locks in imap-db/socket/') parser.add_argument('--show-pidfiles', action='store_true', default=False, help='Show locks in /var/run') parser.add_argument('--show-non-waiting', action='store_true', default=False, help=Show locks that don't have waiters) args = parser.parse_args() cyrus_bindir = '/usr/cyrus/bin/' socket_dir = '/datastore/imap-db/socket/' locks = [] # parse locks and waiters with open('/proc/locks') as f: for line in f: # remove double spaces line = line.replace(' ', ' ') # Format examples: # 46: FLOCK ADVISORY WRITE 5542 00:25:829847 0 EOF # 46: - FLOCK ADVISORY WRITE 5544 00:25:829847 0 EOF # 50: POSIX MANDATORY READ 4820 fd:04:3815033 1073741826 1073742335 fields = re.match('\d+: (?Pwaiting- )?[A-Z]+ [A-Z]+ ' '(?PmodeREAD|WRITE) ' '(?Ppid\d+) ' '(?Pdev_major[a-f0-9]{2}):' '(?Pdev_minor[a-f0-9]{2}):' '(?Pinode\d+) .*', line) if not fields: print('WARN: Ignoring unmatched line output: {0}'.format(line)) continue waiting = False if fields.group('waiting') is not None: waiting = True pid = fields.group('pid') proc_path = os.path.join('/proc', pid) if not os.path.isdir(proc_path): print('INFO: Program with pid {0} vanished'.format(pid)) continue prog_name = os.readlink(os.path.join(proc_path, 'exe')) # convert dev number from hex to decimal # as returned in os.stat() decimal_devnode = int('{0}{1}'.format(fields.group('dev_major'), fields.group('dev_minor')), 16) # look up filename locked_filename = 'UNKNOWN' fd_path = os.path.join(proc_path, 'fd') for fd_file in os.listdir(fd_path): fd_fullpath = os.path.join(fd_path, fd_file) stat_res = os.stat(fd_fullpath) if stat_res.st_ino == int(fields.group('inode')) and \ stat_res.st_dev == decimal_devnode: locked_filename = os.readlink(fd_fullpath) break # store new_lock = {'prog': prog_name, 'pid': pid, 'mode': fields.group('mode'), 'file': locked_filename, 'waiters': [] } if waiting: locks[-1]['waiters'].append(new_lock) else: locks.append(new_lock) # Output locks and possible waiters shown_something = False for lock in locks: prog = lock['prog'] file = lock['file'] # Skip known locks that are always waiting if file.startswith(socket_dir) and not args.show_sockets: continue if file.endswith('.pid') and not args.show_pidfiles: continue if not prog.startswith(cyrus_bindir) and \ args.all_programs is False: continue waiters = lock['waiters'] if len(waiters
which quota backend for cyrus imapd?
Hi, which quota backend is the recommended backend nowadays? Obvious candidates are quotalegacy and skiplist. quotalegacy has per-user files while skiplist would use one central file for all users, which means more lock contention. The man page entry for quota_db doesn't give any explanation which backend to prefer. Which backend is used at Fastmail? :) Best regards, Thomas Jarosch
cyrusdb_flat and locking
Hi Bron, the first rough version of the lock validator is running and already has some interesting results: The cyrus flat db backend doesn't do proper locking. 1. When it opens a file in myopen(), the file is not locked. This is a common pattern in cyrus code and harmless. myfetch() acquires an EXCLUSIVE lock when called with a tid pointer only. Funnily the code mentions in multiple places it should release the read-only lock and even calls lock_unlock(). No lock_shared() in the code at all. 2. When storing data, the xxx.NEW file is not locked. Or did I miss anything? I'll try to fix this and also introduce a db-locktype variable to keep track of the current locking state if you don't mind. Cheers, Thomas
Re: Re: [RFC PATCH] Fix index lock breakage in mailbox_release_resources()
Hi Bron, On Sunday, 19. July 2015 10:05:12 you wrote: On Sun, Jul 19, 2015, at 07:06 AM, Thomas Jarosch wrote: mailbox_release_resources() closes the file descriptor to the index. This implicitly releases any lock on the file. Unfortunately the function does not reset the index_locktype state. Therefore we didn't notice that a mailbox rename operated on an unlocked index of the new mailbox. Yikes. How did you discover this? I'm surprised we haven't run into this already. currently cyrus-imapd 2.4.18 deadlocks about once a week. Similar symptoms as described here: http://lists.andrew.cmu.edu/pipermail/info-cyrus/2013-May/036951.html So I'm now going through the locking code. Of course I can't reproduce it on my development machine and I tried really hard. The only pattern I can see is that it happens one machines with very large mailboxes. They run quota -f nightly, ipurge to delete trashed mails older than 30 days and the cyr_getdeleted tool I posted recently. - the more cyrus tools we run in parallel, the worse the problem seems. Note: mailbox_index_lock() implies a mailbox_read_index_header(). But it also does more work than just read the index header, so there might be side effects. I changed to mailbox_index_lock_internal() since we don't want to abort on a DELETED mailbox. ok @@ -4370,7 +4371,7 @@ static int mailbox_reconstruct_create(const char *name, struct mailbox **mbptr) /* Attempt to open index */ r = mailbox_open_index(mailbox); -if (!r) r = mailbox_read_index_header(mailbox); +if (!r) r = mailbox_lock_index(mailbox, LOCK_EXCLUSIVE); if (r) { printf(%s: failed to read index header\n, mailbox-name); syslog(LOG_ERR, failed to read index header for %s, mailbox- name); I'm not convinced by this one - the whole point with reconstruct is that you're working with a potentially broken mailbox, and you want to check each step more simply to see if it works. I've left this one in place. if you back out that change, it won't work anymore: mailbox_read_index_header() will bail out because the index is not locked. I've tested that code path by faking the return code of mailbox_open() so it executes the reconstruct_create() code path. Right now I've interfaced the locking code with helgrind's lock validation mechanism and annotated all lock_xx() calls. That works more or less. Since kernel 3.14.x, there's a userspace version of lockdep called liblockdep. I'll switch to that soon as it supports tryLock() style calls as we do with lock_nonblocking(). The tricky part is that cyrus is a multi-process tool instead of a multi- threaded one, so the current lock validation tools are more or less useless. I came up with a workaround: Dump the lock state information in lock_xxx() to a file, probably JSON. A daemon tool will later on read that lock transation file and feed the information in the lock validation tool / library. Hopefully this works out, I have a feeling we have a classic ABBA deadlock somewhere in the code. Cheers, Thomas
[PATCH 2/2] lock_reopen(): Properly handle EINTR in the unlock-on-error path
EINTR handling in lock_unlock() was introduced with this commit: commit b6f2c3ad6197c4ec1da5da6a79d79c2c93ed1a1c Author: Larry Greenfield l...@andrew.cmu.edu Date: Tue Nov 26 18:13:52 2002 + look for EINTR on unlocks --- lib/lock_fcntl.c | 12 ++-- lib/lock_flock.c | 4 ++-- 2 files changed, 4 insertions(+), 12 deletions(-) diff --git a/lib/lock_fcntl.c b/lib/lock_fcntl.c index 883864e..fcfc1d7 100644 --- a/lib/lock_fcntl.c +++ b/lib/lock_fcntl.c @@ -91,11 +91,7 @@ EXPORTED int lock_reopen(int fd, const char *filename, if (!r) r = stat(filename, sbuffile); if (r == -1) { if (failaction) *failaction = stating; -fl.l_type= F_UNLCK; -fl.l_whence = SEEK_SET; -fl.l_start = 0; -fl.l_len = 0; -r = fcntl(fd, F_SETLKW, fl); +r = lock_unlock(fd, filename); return -1; } @@ -104,11 +100,7 @@ EXPORTED int lock_reopen(int fd, const char *filename, newfd = open(filename, O_RDWR); if (newfd == -1) { if (failaction) *failaction = opening; -fl.l_type= F_UNLCK; -fl.l_whence = SEEK_SET; -fl.l_start = 0; -fl.l_len = 0; -r = fcntl(fd, F_SETLKW, fl); +r = lock_unlock(fd, filename); return -1; } dup2(newfd, fd); diff --git a/lib/lock_flock.c b/lib/lock_flock.c index eed0dd4..3a51139 100644 --- a/lib/lock_flock.c +++ b/lib/lock_flock.c @@ -88,7 +88,7 @@ EXPORTED int lock_reopen(int fd, const char *filename, if (!r) r = stat(filename, sbuffile); if (r == -1) { if (failaction) *failaction = stating; -flock(fd, LOCK_UN); +lock_unlock(fd, filename); return -1; } @@ -97,7 +97,7 @@ EXPORTED int lock_reopen(int fd, const char *filename, newfd = open(filename, O_RDWR); if (newfd == -1) { if (failaction) *failaction = opening; -flock(fd, LOCK_UN); +lock_unlock(fd, filename); return -1; } dup2(newfd, fd); -- 1.9.3
Re: Laying groundwork
Hi Bron, On Saturday, 16. May 2015 09:46:40 Bron Gondwana wrote: So the way to work through all the records in a mailbox was: int r; uint32_t recno; struct index_record record; for (recno = 1; recno = mailbox-i.num_records; recno++) { r = mailbox_read_index_record(mailbox, recno, record); if (r) continue; /* if not looking at expunged records */ if (record.system_flags FLAG_EXPUGNED) continue; [new API] const struct index_record *record; struct mailbox_iter *iter = mailbox_iter_init(mailbox, modseq, iterflags); while ((record = mailbox_iter_step(iter))) { ... } mailbox_iter_done(iter); This allows both a fetch of only records changed since a particular modseq, or a fetch of only records without certain flags - check some of the uses for examples. A common pattern is init(mailbox, 0, ITER_SKIP_EXPUNGED). that's a sound API improvement. I already reviewed the concept last week but never gotten around to actually send the email :) Cheers, Thomas
Re: uniqueid based paths (was: Minutes 16/3)
On Monday, 23. March 2015 23:06:16 Bron Gondwana wrote: What about providing a virtual annotation that allows you to query the uniqueid of a folder via IMAP? That way a client can automatically include that information in debug reports if available. This already exists: C: 12 getmetadata INBOX.dreamcatcher.keffiyeh (/shared/vendor/cmu/cyrus-imapd/uniqueid) S: * METADATA INBOX.dreamcatcher.keffiyeh (/shared/vendor/cmu/cyrus-imapd/uniqueid ae94c253-e6df-4700-901b-96b06b705eaa) S: 12 OK Completed (from a Cassandane test run with debugging turned on) splendid! Thanks for checking. Thomas
Re: uniqueid based paths (was: Minutes 16/3)
On Monday, 23. March 2015 12:09:23 Thomas Jarosch wrote: A tool that lists the user folder - uniqueid based path in a machine parsable way (=scriptable) might help here. after talking to a colleague in development, another idea came up: What about providing a virtual annotation that allows you to query the uniqueid of a folder via IMAP? That way a client can automatically include that information in debug reports if available. Thomas
uniqueid based paths (was: Minutes 16/3)
Hi Bron, On Tuesday, 17. March 2015 09:50:34 Bron Gondwana wrote: Bron: - working on uniqueid-based-paths - we'll need upgrade path for existing installs (maybe detect old path and rename-on-open?) - reconstruct et al will need support for the new paths. Rebuild mailboxes.db from the new cyrus.header with all mailboxes.db data in it. I'm still wondering if there's an easier solution to the uniqueid based path problem while keeping a human readable filesystem structure. That would make backup and restore with traditional unix backup tools much easier. I'll post a small tool we developed for cyrus 2.4 later on. Having fast, atomic renames is certainly a welcoming improvement, the rename stuff has bitten us often, too. My colleague Gerd v. Egidy wrote this the last time on the topic: --- Both are binary structures which are cyrus version dependent and require a working cyrus installation of this version to be of any value. This means selecting the files to restore from backup has to be tightly integrated with a running cyrus which really complicates things. It is also a big burden for version-independent long-term storage. Currently I can take a snapshot from a 10 year old cyrus version, throw away all mailboxes.db and cyrus.headers, and recreate a working cyrus 2.4 installation from that just using simple shell scripts. I lose flags, acls and other metadata, but I get all the mails and folders, and that is what counts most for the users. --- If we switch to uniqueid based paths, what about providing symlinks to a virtual directory structure? Updating symlinks would probably not be atomic since we can't do a cyrus.header update and update a symlink at the same time. Though we could only complete the transaction in mailboxes.db when all steps are done. So this could work even when imapd is shut down in the middle of a rename. May be the symlinks might even help during the migration / upgrade phase? What about writing out the folder information from cyrus.header also as JSON text files? This file is not interpreted by cyrus normally, but it could be used by other tools and aid long term storage of backups. In our backup system, we only keep text dumps of mailboxes.db and annotations.db for maximum compatibility. Those get converted back on restore. Cheers, Thomas
imapd 2.4: Tool to list deleted messages and folders
Hi, we are doing backups of our imap mailboxes using simple unix tools like find + afio. Sometimes, when a mailbox is still locked, deleted files were not expunged immediately and therefore ended up in the backup. During testing we found an additional problem with re-creating a deleted folder and fixed this one in the same run, too. Attached you'll find a few tools for cyrus imapd 2.4(.17): 1. cyr_getdeleted: Lists deleted IMAP messages for a given mboxspec. (+ additional patch to add it to the build system) 2. A patch for cyrus to create a .deleted file when a folder is marked for deletion. If the folder is re-created before it was removed from disk and we find the .deleted file, we clear out all *stale* messages. 3. find_wrapper.py: A wrapper around find(1) to automatically exclude deleted messages and folders. Useful for backup shell scripts. Makes use of the .deleted file, too. I'm putting this out here in case these additions might come in handy for other people running cyrus imapd 2.4. Most of those tools and patches were developed by my colleague Samir Aguiar. As written in the Mailbox deletion race: folders and files are never deleted email[1] , we'll especially revisit issue 2. when testing cyrus 2.5.x Have a nice day, Thomas [1] http://lists.andrew.cmu.edu/pipermail/cyrus-devel/2015-February/003133.html /* Same license as the cyrus imapd (c) 2015 Intra2net AG - Samir Aguiar */ #include config.h #include unistd.h #include stdlib.h #include stdio.h #include string.h #include global.h #include cyr_lock.h #include exitcodes.h #include mboxlist.h #include mailbox.h /* config.c stuff */ const int config_need_data = 0; /* current namespace */ static struct namespace deleted_namespace; static int quiet = 1; #define log_msg(fmt, ...) \ do { \ if (!quiet) \ { \ fprintf (stdout, fmt, __VA_ARGS__); \ fflush(stdout); \ } \ } while(0) #define log_msg_v(fmt, ...) \ do { \ fprintf (stdout, fmt, __VA_ARGS__); \ fflush(stdout); \ } while(0) #define fatal_quit(msg) fprintf(stderr, msg); exit(EXIT_FAILURE); /* * Callback function that gets called for each mailbox found. * * Returns the name of the deleted and expunged files set to be * unlinked when the last lock of the mailbox is released. * * If the mailbox itself was deleted, it returns only its name * because the files within are also going to be unlinked later. * * We can't return a nonzero value because in that case mboxlist_findall() * stops searching. */ int get_expunged_data(const char *mailbox_name, int matchlen __attribute__((unused)), int maycreate __attribute__((unused))) { struct mailbox *mailbox = NULL; /* Load with a shared lock */ if (mailbox_open_irl(mailbox_name, mailbox) != 0) { log_msg(Failed to open mailbox %s\n, mailbox_name); return 0; } if (mailbox-mbtype MBTYPE_REMOTE) { log_msg(Skipping remote mailbox: %s, mailbox_name); return 0; } struct index_record *records = NULL; struct index_record *record; uint32_t recno; int alloc = 0; int num = 0; for (recno = 1; recno = mailbox-i.num_records; recno++) { /* Pre-allocate more space */ if (alloc = num) { alloc += 64; records = xrealloc(records, sizeof(struct index_record) * alloc); } record = records[num]; if (mailbox_read_index_record(mailbox, recno, record)) continue; if (record-system_flags FLAG_UNLINKED) { const char *record_path = mailbox_message_fname(mailbox, record-uid); // some files stay listed even after unlinked if (access(record_path, R_OK) == 0) log_msg_v(%s\n, record_path); } num++; } const char *path; path = mboxname_datapath(mailbox-part, mailbox-name, 0); if (path) log_msg(Done with mailbox %s\n, path); free(records); mailbox_close(mailbox); return 0; } void usage(const char *name) { log_msg_v(usage: %s [-v] mailbox path\n Print all files to be unlinked from the mailboxes in the given path.\n Those files correspond to messages that were deleted and expunged\n and will be unlinked when all connections to the mailbox are closed.\n\n mailbox path has to be a relative path, e.g. user/admin\n\n Valid arguments:\n -v\t\tverbose, log every action and not only filenames\n -h\t\thelp, print this message\n, name); exit(EXIT_SUCCESS); } int main(int argc, char **argv) { struct mboxlist_entry mbentry; struct mailbox *mailbox = NULL; int rc; char buf[MAX_MAILBOX_PATH]; if ((geteuid()) == 0 (become_cyrus() != 0)) { fatal_quit(must run as the Cyrus user); } int
Re: 2.5 preparation status
Hi Bron, I just noticed an important bugfix for ACL removal is still sitting in the bug tracker untouched. Please have a look at https://bugzilla.cyrusimap.org/show_bug.cgi?id=3824 Without it the ACL strings get messed up in mailboxes.db on current glibc versions. We run the patch for about a year without any further ACL string defects. Would be nice to see it applied to git master :) Cheers, Thomas
Re: Re: 2.5 preparation status
On Friday, 12. December 2014 11:07:39 you wrote: Is not possible to user strmove instead of strcpy? POSIX doesn't support strmove() AFAIK. - it's non-standard. Thomas
imapd v2.4.17: UID SEARCH ALL vs. UID SEARCH 1:*
Hi, I just discovered an interesting behavior and I'm wondering why cyrus-imapd 2.4.17 behaves the way it does: Basically I do this: - Select a mailbox - Get the list of UIDs via UID SEARCH 1:* - Add a new message in another client / connection - Get the list of UIDs again via UID SEARCH 1:* - the UID list stays the same on the original connection, it does send an increased EXISTS counter though. If I reissue the UID SEARCH 1:* command, it includes the new UID. If I do the same with the UID SEARCH ALL command, it immediately shows the new UID after adding the message. An updated EXISTS response is sent, too. Any clue what's going on? Why is the ALL keyword more special than 1:*? Here's the protocol dump: . SELECT INBOX/EntwAPw-rfe * 1 EXISTS * 0 RECENT * FLAGS (\Answered \Flagged \Draft \Deleted \Seen $mdnsent) * OK [PERMANENTFLAGS (\Answered \Flagged \Draft \Deleted \Seen $mdnsent \*)] Ok * OK [UIDVALIDITY 1387207616] Ok * OK [UIDNEXT 100015] Ok * OK [HIGHESTMODSEQ 23] Ok * OK [URLMECH INTERNAL] Ok . OK [READ-WRITE] Completed . UID SEARCH 1:* * SEARCH 1 . OK Completed (1 msgs in 0.000 secs) -- new message was added in between by another client -- . UID SEARCH 1:* * 2 EXISTS * 0 RECENT * SEARCH 1 . OK Completed (1 msgs in 0.000 secs) -- doing another UID SEARCH 1:* -- . UID SEARCH 1:* * SEARCH 1 100015 . OK Completed (2 msgs in 0.000 secs) *** now the same with UID SEARCH ALL *** . SELECT INBOX/EntwAPw-rfe * 2 EXISTS * 0 RECENT * FLAGS (\Answered \Flagged \Draft \Deleted \Seen $mdnsent) * OK [PERMANENTFLAGS (\Answered \Flagged \Draft \Deleted \Seen $mdnsent \*)] Ok * OK [UIDVALIDITY 1387207616] Ok * OK [UIDNEXT 100016] Ok * OK [HIGHESTMODSEQ 24] Ok * OK [URLMECH INTERNAL] Ok . OK [READ-WRITE] Completed . UID SEARCH ALL * SEARCH 1 100015 . OK Completed (2 msgs in 0.000 secs) -- new message was added in between by another client -- . UID SEARCH ALL * 3 EXISTS * 0 RECENT * SEARCH 1 100015 100016 . OK Completed (3 msgs in 0.000 secs) Cheers, Thomas
[RFC 5162] QRESYNC: Clarification on CLOSED response
Hello Alexey and Dave, after updating an IMAP server to a version supporting QRESYNC, we started facing unexpected issues with one client. (server: cyrus-imapd 2.4.17, client: c-client from Mark Crispin) Of special interest became the CLOSED response: RFC 5162, sect 3.7 states that any server implementing the QRESYNC extension must send the CLOSED response on mailbox changes via SELECT/EXAMINE. This seems to happen unconditionally, regardless if the client enabled QRESYNC or CONDSTORE. Unfortunately the c-client IMAP implementation uses a fake CLOSED response internally to mark a connection as broken - the client goes offline during a folder change. Clearly the client needs to be fixed since RFC 3501 states: Client implementations SHOULD ignore response codes that they do not recognize. *** The big question *** Was it intentional that the CLOSED response should always be sent? If yes, what was the intention? May be the wording can be clarified a bit before the updated RFC is final: Should the server send the CLOSED response even if QRESYNC is not enabled? Best regards, Thomas Jarosch
Crash on index upgrade: failed to mmap new message file
Hi, I just encountered one system where cyr_expire / reconstruct / imapd process died during the index upgrade from cyrus 2.3.16 to 2.4.17. The error message was: fatal error: failed to mmap new message file To make a long debug session short: One folder had a broken cyrus.index file that seemed to contain an index record for uid 0 (- invalid record). (I saved the file in case it's of interest) Consider this call stack: upgrade_index_record(): - mailbox_message_fname(mailbox, record-uid) - message_parse(fname, record) When mailbox_message_fname(mailbox, record-uid) is called for uid 0, the function returns the folder name without any message filename. In my case that was .../user/xxx/Sent items instead of .../user/xxx/Sent items/123.. message_parse() then calls map_refresh() with the folder name (ugh!) and dies with the above error message. The workaround was to kill the broken cyrus.index file and run reconstruct. @Bron: Do you know by chance if this error case is detected in 2.5? I took a quick look at the code in git HEAD and it seems to behave the same. Cheers, Thomas
Re: Robust atomic multiple folder rename
Hi Bron, On Thursday, 2. January 2014 09:52:13 Bron Gondwana wrote: At this point, we go through all the old folder and delete the on-disk files. Once that's done, we can do a single update in the mailboxes.db: user.foo.sub %(TYPE DELETED) (we keep DELETED tombstones in mailboxes.db now to ensure that UIDVALIDITY never gets reused, but also to detect the difference between folder created on A and folder deleted on B in a multi-master replication setup) *snip* (looking good) So I guess the flexible mboxlist format stores the UIDVALIDITY in mailboxes.db (I haven't looked at in git yet). Let's assume the user creates and renames a folder with the same name multiple times. We'll end up with multiple user.foo.sub %(TYPE DELETED) user.foo.sub %(TYPE DELETED) user.foo.sub %(TYPE DELETED) user.foo.sub %(TYPE DELETED) entries in mailboxes.db. One corner case we should think of is if we recover from a crash and mailboxes.db looks like this: user.foo.sub %(TYPE DELETED) user.foo.sub %(TYPE DELETED) user.foo.sub %(TYPE DELETED) user.foo.sub %(TYPE DELETED) user.foo.sub %(NAMELOCK user.foo.sub TYPE RENAMELOCKED) user.foo.new %(NAMELOCK user.foo.sub TYPE RENAMETEMP) As far as I can tell, the new scheme would work fine here, too. We just need to be careful to ignore the DELETED entries in all tools that juggle around the mboxlist entries. Best regards, Thomas
Downgrading from 2.4.x to 2.3.x (testing only)
Hi Bron, I'm currently testing the upgrade path from 2.3.16 to 2.4.17. Everything is smooth so far, mailbox indexes were upgraded automatically after starting 2.4.17 for the first time: -- Dec 18 13:48:00 ctl_cyrusdb[14851]: checkpointing cyrus databases Dec 18 13:48:00 ctl_cyrusdb[14851]: done checkpointing cyrus databases Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.admin (10 - 12) Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.admin.EntwAPw-rfe (10 - 12) Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.admin.GelAPY-schte Elemente (10 - 12) Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.admin.Gesendete Elemente (10 - 12) Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.cyrus (10 - 12) Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.mueller (10 - 12) Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.mueller.EntwAPw-rfe (10 - 12) Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.mueller.GelAPY-schte Elemente (10 - 12) Dec 18 13:48:00 cyr_expire[14850]: Index upgrade: user.mueller.Gesendete Elemente (10 - 12) -- For the fun of it (doing it for science!) I downgraded again to 2.3.16 while keeping the new index files. Every access to the mailbox via IMAP was blocked with Future index version xxx. Good. - So the indexes are protected against a downgrade without a reconstruct. Will mailboxes.db / annotations.db also stay fine after a downgrade? I don't think there are any data structure differences between 2.3.x and 2.4.x in there but I thought I'd better ask. Cheers, Thomas
Re: [PATCH/FEATURE] Per-user XLIST support for cyrus-imapd
On Wednesday, 1. May 2013 00:03:20 Bron Gondwana wrote: What are your main blockers to moving to 2.4? I suspect the reindexing IO his is a major factor, and something I'd like to solve (by delaying the inevitable really... there are enough cache format changes that reindexing will be necessary at some point - but stopping the storm is important) Surprisingly it's QA: We have many scripts and automated systems that integrate into cyrus (backup restore, automatic sieve script creation, new user creation etc) that need extensive testing. Some of those scripts have assumptions about internal cyrus stuff (sic!) like the hierarchy separator or the filename / directory layout for performance reasons. We already started writing autotest[1] tests for our cyrus subsystem, but it's still too incomplete for a smooth upgrade test. [1] http://autotest.github.io/ As far as I understood it, Outlook 2013 does not implement the RFC properly. Or did I miss something? That's the thing we are trying to solve here. Problem is you can't configure the sent mail folders manually anymore. Grrr. I'm not certain actually - I'd have to have a look at your patches. I know some clients expect to see flags in response to normal LIST commands as well, which is annoying - but we added an option to always return them, even to regular LIST. ok I have no problem with people doing their own things. We had our own separate notifications framework which we were running at FastMail for over a year before the eventsource stuff got merged upstream and we dropped our patches in favour of rewriting our notifdy to be an eventsource listener instead. Competing development is good. In this particular case, the only reason I'm not interested in your patches instead of my own work is that I really want to allow the /shared/specialuse annotation to work rather than hard coding folder names in imapd.conf - and the reason I want THAT is because we need to support localisation of folder names pretty soon, and hence the xlist-* trick won't work any more. localization is one of the reasons why we needed this per user, too. We thought about storing the folder names inside an annotation too, but since the specialuse stuff is in 2.5, it would have been even more wasted time :) The good side of storing this in a per-user config _file_ is easy access for the administrator or third party scripts. Otherwise you need another tool to manipulate the annotation (+authentication hassle). Besides, standards are good if they provide everything you need. Thanks for your contributions, and for making them available. Maybe they'll be useful to other people who aren't ready for 2.4/2.5 yet :) exactly :) Thomas
[PATCH] Fix incorrect readlink() buffer handling
Also don't mix tabs and spaces. Creation of a new bug on cyrus bugzilla currently doesn't work for me (blank page on submit). Signed-off-by: Thomas Jarosch thomas.jaro...@intra2net.com --- timsieved/actions.c | 17 +++-- 1 files changed, 11 insertions(+), 6 deletions(-) diff --git a/timsieved/actions.c b/timsieved/actions.c index 239d01d..347501c 100644 --- a/timsieved/actions.c +++ b/timsieved/actions.c @@ -474,20 +474,25 @@ static int isactive(char *name) { char filename[1024]; char activelink[1024]; +ssize_t link_len; snprintf(filename, 1023, %s.bc, name); memset(activelink, 0, sizeof(activelink)); -if ((readlink(defaultbc, activelink, sizeof(activelink)-1) 0) - (errno != ENOENT)) + +link_len = readlink(defaultbc, activelink, sizeof(activelink)-1); +if (link_len == -1) { - syslog(LOG_ERR, readlink(defaultbc): %m); - return FALSE; +if (errno != ENOENT) +syslog(LOG_ERR, readlink(defaultbc): %m); + +return FALSE; } +activelink[link_len] = '\0'; if (!strcmp(filename, activelink)) { - return TRUE; +return TRUE; } else { - return FALSE; +return FALSE; } } -- 1.7.4.4
Re: [PATCH] Fix incorrect readlink() buffer handling
On Thursday, 13. October 2011 10:13:54 Thomas Jarosch wrote: Also don't mix tabs and spaces. I guess my mail client will eat the tabs in the patch. Here's the patch as file for easy git am consumption. Cheers, Thomas From 6251ba2071e6adf05f630608cbf0768417ad9d3f Mon Sep 17 00:00:00 2001 From: Thomas Jarosch thomas.jaro...@intra2net.com Date: Thu, 13 Oct 2011 10:04:01 +0200 Subject: [PATCH] Fix incorrect readlink() buffer handling Also don't mix tabs and spaces. Signed-off-by: Thomas Jarosch thomas.jaro...@intra2net.com --- timsieved/actions.c | 17 +++-- 1 files changed, 11 insertions(+), 6 deletions(-) diff --git a/timsieved/actions.c b/timsieved/actions.c index 239d01d..347501c 100644 --- a/timsieved/actions.c +++ b/timsieved/actions.c @@ -474,20 +474,25 @@ static int isactive(char *name) { char filename[1024]; char activelink[1024]; +ssize_t link_len; snprintf(filename, 1023, %s.bc, name); memset(activelink, 0, sizeof(activelink)); -if ((readlink(defaultbc, activelink, sizeof(activelink)-1) 0) - (errno != ENOENT)) + +link_len = readlink(defaultbc, activelink, sizeof(activelink)-1); +if (link_len == -1) { - syslog(LOG_ERR, readlink(defaultbc): %m); - return FALSE; +if (errno != ENOENT) +syslog(LOG_ERR, readlink(defaultbc): %m); + +return FALSE; } +activelink[link_len] = '\0'; if (!strcmp(filename, activelink)) { - return TRUE; +return TRUE; } else { - return FALSE; +return FALSE; } } -- 1.7.4.4
Re: [PATCH] Fix incorrect readlink() buffer handling
Hi Bron, On Thursday, 13. October 2011 11:03:16 you wrote: On 10/13/2011 10:13 AM, Thomas Jarosch wrote: Also don't mix tabs and spaces. Please don't do this. I know the tabs/spaces standard is ugly, but that's what this codebase has. Normally I try to avoid such changes. I was unsure in this case and oriented myself at the function below it, which doesn't use tabs at all. astyle anyone? ;) What about the actual code change? Cheers, Thomas
Re: [PATCH] Fix incorrect readlink() buffer handling
On Thursday, 13. October 2011 12:15:17 Dave McMurtrie wrote: @Dave: Any issue known with bugzilla? Not that I'm aware of. Why do you ask? If there's something broken, let me know and I'll take care of it. I tried to create a new bug (twice) and it started processing after hitting submit. Shortly after that I only got a blank page and no bug was created. If other people still can enter new submissions, then something is messed up here locally. So just sit back and watch if new submission come in :) Cheers, Thomas
Re: Cyrus Imap and Automake
Hi Дилян, here's some feedback about your build system question. Note: I'm not one of the cyrus core developers. if I rewrite the build system of Cyrus imap 2.4(.10) to use Automake to generate the Makefile.in-files, will the patch be accepted in reasonable time in git/master? Have you considered alternatives to GNU Autotools? We have experience with GNU Autotools in our company projects as well as open source projects for several years now. We have found that it has several shortcomings: 1. Autotools version conflicts You can compile a released source package without any Autotools on your system. But as soon as you a) want to develop b) want to install a patch which modifies the build system (like a new path to a library, something that adds a new file,...). This is often happens as part of packaging for .rpm or .deb. you need Autotools on your machine. If the Autotools version on your machine and the one used to build the release are not compatible you can't build. Installing a different Autotools version on a given distribution without breaking something or fixing a huge list of dependency problems is nearly impossible. I have experience with this... 2. Build speed and parallelization The configure run can't be run on multiple cpu cores at once and is usually slow. Running compile jobs on multiple cores at once can become tricky if you have internal dependencies like libs that are included in your project. Sometimes we experienced race conditions or cases where you had to start the build process two or three times to get a successful build. 3. Documentation and learning There is a book about Autotools (http://sourceware.org/autobook/). I have read it and don't think that it explains how to use and troubleshoot Autotools in a good way. Maybe it's hard to explain because in Autotools everything is a macro that calls a macro that calls a macro... I think the learning curve of Autotools is very steep and finding out what's really happening during troubleshooting is very time consuming. Because of these shortcomings we have used CMake for new projects for about a year. We have made good experience with it. We have now started to port all Autotools projects over to CMake. Please take a look at CMake and consider it before you start investing time into Autotools. Best regards, Thomas
Re: [patch] Improved duplicate suppression
On Wednesday, 20. July 2011 11:05:01 Sébastien Michel wrote: You are two to compete for this patch :) http://asg.andrew.cmu.edu/archive/message.php?mailbox=archive.cyrus-devel msg=2984 Same company ;) Kristóf is our new software engineering intern from Budapest University of Technology and Economics, Hungary. Best regards, Thomas Jarosch
Duplicate suppression and message-id reuse
Hello, I recently had to disable duplicate suppression on our mailservers as a broken mail client (=some Outlook versions) reuses the message-id if you forward a message. Long story short, the users are now complaining about duplicated emails instead of never received messages... Here's my proposal to make duplicate suppression more robust: Combine the message-id header field with the date header field for the duplicate check. If a broken client reuses the message-id, there's a good chance the date field changed. Opinions? Best regards, Thomas Jarosch
Re: Cyrus IMAPd 2.2.13p1 2.3.15 Released
On Wednesday, 9. September 2009 18:43:43 Dave McMurtrie wrote: TJ Regarding the buffer overflow: The cert website currently outputs a TJ Lotus Notes exception. Is the overflow theoretically exploitable TJ via a malicious email or does a user need to upload a malicious TJ sieve script? Hmmm... Still down... Apologies for the CERT vulnerability link not existing. We had planned, along with CERT, to make a coordinated announcement about this tomorrow in order to give the major Cyrus vendors a chance to get their distributions patched. Unfortunately, Debian put out their DSA over the weekend so we didn't want to wait until tomorrow to put out our announcement. CERT provided that URL for us, but since they haven't yet formally released this vulnerability the URL isn't active yet. Thanks for clearing this up! I'm very happy this is not triggerable via a malicious email :) Thomas
Re: Last call for Cyrus 2.3.14
On Tuesday, 24. March 2009 18:21:39 Wesley Craig wrote: On 24 Mar 2009, at 06:47, Thomas Jarosch wrote: Thanks for calling. I don't like to spoil the party, what about the issue which allows to create empty ACLs? I seem to have forgotten to open a bug report, what do you think is the best approach? 2.3.14 is more or less baked and people have been testing it. This bug has been around since sometime in 2004, and the solution needs some discussion. So, please open a bug report. Ok, filed as #3147: https://bugzilla.andrew.cmu.edu/show_bug.cgi?id=3147 btw: There was no 2.3.x version number in bugzilla's create new bug report dialog. After the bug was created, I was able to change the version. Have a nice weekend, Thomas
Re: Last call for Cyrus 2.3.14
Hi Ken, On Tuesday, 24. March 2009 11:32:10 Ken Murchison wrote: If there are any outstanding issues that you believe still need to be addressed in 2.3.14, please let me know ASAP. Thanks for calling. I don't like to spoil the party, what about the issue which allows to create empty ACLs? The subject of the discussion was [RFC PATCH] Prevent setacl for empty identifiers I seem to have forgotten to open a bug report, what do you think is the best approach? Best regards, Thomas
Re: [RFC PATCH] Prevent setacl for empty identifiers
Hi Wes, rfc4314 seems to specifically disallow empty identifiers. Also, I think you patch would probably permit an identifier of -. The check is done after the - handling so it should take care of it. BTW, I have a patch to this code that I'm currently holding, which introduces a leading + to identifiers. It's for the case of XFERing mailboxes with invalid ACLs, i.e., a leading + means permit canonicalization to fail. Speaking of canonicalization, I wonder that the canonicalization routines would allow empty IDs... looks like auth_krb5.c:mycanonifyid() probably wouldn't, and auth_unix.c:mycanonifyid() used to but now doesn't. Perhaps the problem is this: https://bugzilla.andrew.cmu.edu/cgi-bin/cvsweb.cgi/src/cyrus/lib/auth_unix.c.diff?r1=1.37;r2=1.38 Removing those lines allows canonicalization of zero length IDs. Can't be a good thing, even outside of ACLs. Good catch. I'm wondering why that code in auth_unix.c was changed at all? There must be a valid use case (?) to it. How do we go from here? Once we agree on a patch(set), I could open a bug report, if that helps any. Cheers, Thomas
Re: Problem with skiplist
On Thursday, 11. September 2008 14:40:03 you wrote: We are using Linux and uname -a Linux ms01 2.6.5-7.276.PTF.205323.0-bigsmp #1 SMP Mon Jul 24 10:45:31 UTC 2006 i686 At http://www.intevation.de/roundup/kolab/issue840 we saw that someone recomended a linux 2.6.19.2 http://www.intevation.de/roundup/kolab/msg12474 IIRC the fix was made in 2.6.20, so it's likely 2.6.19.2 contains it, too. Judging from your long kernel version number, your linux distribution vendor might have already backported the fix. You'll have to check your kernel package sources for mmap() fixes or just upgrade to something recent. Thomas
Re: Time for 2.3.13?
Hello Bron and Wes, I've experienced another bug and I'm not sure if the skiplist/locking code is at fault or not. I was deleting huge amounts of mail in many subfolders on a slow box. The tool that started the request got a timeout after some minutes and this left cyrus-imapd in a broken state: The mailboxes were still listed in mailboxes.db and the mails were still on disc. Every cyrus.* file in the mail directories were already deleted and so cyrus was unable to recover from this. Is this somehow related or is this another issue that needs tracing? Thomas
[patch] mailbox select option for ipurge
Hello Ken, here's a repost of a patch from 2005 that somehow got lost on cyrus-devel: (http://marc.info/?l=cyrus-develm=112134511310688w=2) The patch still works and applies fine to cyrus-imapd 2.3.12p2 Thomas --- Hello, I needed a way to purge only a specific mailbox using the ipurge utility. -f wasn't the way to go as it's recursive. Please have a look at the attached patch. The new option was first called -F mailbox but I can happen too easy that you specify -f mailbox and delete all your mail, so I changed it to -M mailbox. The verbose output was also improved to show which folders were skipped. The patch was developed for cyrus-imapd 2.2.12 and applies cleanly to HEAD. Is this mailinglist or bugzilla the right place to submit patches? Best regards, Thomas Jarosch --- diff -u -r cyrus-imapd-2.2.12/imap/ipurge.c cyrus-imapd.ipurge/imap/ipurge.c --- cyrus-imapd-2.2.12/imap/ipurge.c Thu Jul 14 10:23:26 2005 +++ cyrus-imapd.ipurge/imap/ipurge.c Thu Jul 14 14:13:44 2005 @@ -101,6 +101,7 @@ int verbose = 1; int forceall = 0; +char *mailbox_only = NULL; int purge_me(char *, int, int); int purge_check(struct mailbox *, void *, char *); @@ -115,7 +116,7 @@ if (geteuid() == 0) fatal(must run as the Cyrus user, EX_USAGE); - while ((option = getopt(argc, argv, C:hxd:b:k:m:fsXi)) != EOF) { + while ((option = getopt(argc, argv, C:hxd:b:k:m:fM:sXi)) != EOF) { switch (option) { case 'C': /* alt config file */ alt_config = optarg; @@ -150,6 +151,12 @@ case 'f' : { forceall = 1; } break; +case 'M' : { + if (optarg == 0) { + usage(argv[0]); + } + mailbox_only = xstrdup(optarg); +} break; case 's' : { skipflagged = 1; } break; @@ -183,6 +190,12 @@ quotadb_init(0); quotadb_open(NULL); + if (mailbox_only) { + mboxname_hiersep_tointernal(purge_namespace, mailbox_only, + config_virtdomains ? + strcspn(mailbox_only, @) : 0); + } + if (optind == argc) { /* do the whole partition */ strcpy(buf, *); (*purge_namespace.mboxlist_findall)(purge_namespace, buf, 1, 0, 0, @@ -211,11 +224,12 @@ int usage(char *name) { - printf(usage: %s [-f] [-s] [-C alt_config] [-x] [-X] [-i] {-d days | -b bytes|-k Kbytes|-m Mbytes}\n\t[mboxpattern1 ... [mboxpatternN]]\n, name); + printf(usage: %s [-f] [-M mailbox] [-s] [-C alt_config] [-x] [-X] [-i] {-d days | -b bytes|-k Kbytes|-m Mbytes}\n\t[mboxpattern1 ... [mboxpatternN]]\n, name); printf(\tthere are no defaults and at least one of -d, -b, -k, -m\n\tmust be specified\n); printf(\tif no mboxpattern is given %s works on all mailboxes\n, name); printf(\t -x specifies an exact match for days or size\n); printf(\t -f force also to delete mail below user.* and INBOX.*\n); + printf(\t -M delete mail only in a single mailbox\n); printf(\t -s skip over messages that are flagged.\n); printf(\t -X use delivery time instead of date header for date matches.\n); printf(\t -i invert match logic: -x means not equal, date is for newer, size is for smaller.\n); @@ -228,11 +242,26 @@ struct mailbox the_box; interror; mbox_stats_t stats; + intforce_mailbox; + + /* only delete mail in specified folder? */ + force_mailbox = 0; + if (mailbox_only) { + if (strcmp(name, mailbox_only)) { + if (verbose) + printf(Skipping mailbox %s\n, name); + return 0; + } else + force_mailbox = 1; + } - if( ! forceall ) { + if (!forceall !force_mailbox) { /* DON'T purge INBOX* and user.* */ - if (!strncasecmp(name,INBOX,5) || mboxname_isusermailbox(name, 0)) + if (!strncasecmp(name,INBOX,5) || mboxname_isusermailbox(name, 0)) { + if (verbose) + printf(Skipping mailbox %s\n, name); return 0; + } } memset(stats, '\0', sizeof(mbox_stats_t)); diff -u -r cyrus-imapd-2.2.12/man/ipurge.8 cyrus-imapd.ipurge/man/ipurge.8 --- cyrus-imapd-2.2.12/man/ipurge.8 Thu Nov 20 19:47:48 2003 +++ cyrus-imapd.ipurge/man/ipurge.8 Thu Jul 14 11:31:27 2005 @@ -48,6 +48,10 @@ .B \-f ] [ +.B \-M +.I mailbox +] +[ .B \-C .I config-file ] @@ -92,6 +96,7 @@ by default only deletes mail below shared folders, which means that mails in mailbox(es) below INBOX.* and user.* stay untouched. Use the option \fB-f\fR to also delete mail in mailbox(es) below these folders. +You can explicitly delete mail in a single mailbox via \fB-M mailbox\fR. .PP .I Ipurge reads its configuration options out of the @@ -101,6 +106,9 @@ .TP .BI \-f Force deletion of mail in \fIall\fR mailboxes. +.TP +.BI \-M mailbox +Force deletion of mail in a single mailbox. .TP .BI \-C config-file Read configuration options from \fIconfig-file\fR.
[patch] Improve unix socket permissions
Hello together, currently unix sockets get created by cyrus-master with ownership of root.root and file mode 0777. Attached patch makes the user, group and file mode configurable. If nothing is specified in cyrus.conf, it defaults to CYRUS_USER (+group of the user) and mode 660 for improved security. Would be nice if someone on BSD / unix could give it a try as the file mode is set via umask() instead of chmod() to prevent a race condition during creation of the socket. The patch runs fine with cyrus-imapd 2.3.12p2 on linux. Cheers, Thomas diff -u -r cyrus-imapd-2.3.12p2/master/master.c cyrus-imapd.patched/master/master.c --- cyrus-imapd-2.3.12p2/master/master.c Tue Apr 15 20:11:52 2008 +++ cyrus-imapd.patched/master/master.c Thu Aug 21 16:07:00 2008 @@ -71,6 +71,8 @@ #include sysexits.h #include errno.h #include limits.h +#include pwd.h +#include grp.h #ifndef INADDR_NONE #define INADDR_NONE 0x @@ -302,6 +304,42 @@ return statbuf.st_mode S_IXUSR; } +static int change_socket_ownership(const char *path, const char *owner, const char *group) +{ +/* -1 is don't change */ +uid_t socket_uid = -1; +gid_t socket_gid = -1; +int rtn = 0; + +if (owner strlen(owner) 0) { +struct passwd *pw_user = getpwnam(owner); +if (pw_user) { +socket_uid = pw_user-pw_uid; +socket_gid = pw_user-pw_gid; +} else { +syslog(LOG_ERR, user %s not found, owner); +rtn = -1; +} +} + +if (group strlen(group) 0) { +struct group *pw_group = getgrnam (group); +if (pw_group) { +socket_gid = pw_group-gr_gid; +} else { +syslog(LOG_ERR, group %s not found, group); +rtn = -2; +} +} + +if (chown(path, socket_uid, socket_gid) != 0) { +syslog(LOG_ERR, can't change ownership of file %s to uid %u/gid %u, path, socket_uid, socket_gid); +rtn = -3; +} + +return rtn; +} + void service_create(struct service *s) { struct service service0, service; @@ -437,7 +475,9 @@ } #endif - oldumask = umask((mode_t) 0); /* for linux */ +/* this will set permissions of newly created UNIX sockets to s-socket_mode */ + oldumask = umask((mode_t) 0777 ^ s-socket_mode); + r = bind(s-socket, res-ai_addr, res-ai_addrlen); umask(oldumask); if (r 0) { @@ -447,13 +487,11 @@ syslog(LOG_ERR, unable to bind to %s socket: %m, s-name); continue; } - - if (s-listen[0] == '/') { /* unix socket */ - /* for DUX, where this isn't the default. - (harmlessly fails on some systems) */ - chmod(s-listen, (mode_t) 0777); - } - + +/* Change owner and group of unix socket */ +if(s-listen[0] == '/') +change_socket_ownership(s-listen, s-socket_owner, s-socket_group); + if ((!strcmp(s-proto, tcp) || !strcmp(s-proto, tcp4) || !strcmp(s-proto, tcp6)) listen(s-socket, listen_queue_backlog) 0) { @@ -1294,6 +1332,9 @@ int babysit = masterconf_getswitch(e, babysit, 0); int maxforkrate = masterconf_getint(e, maxforkrate, 0); char *listen = xstrdup(masterconf_getstring(e, listen, )); +char *socket_owner = xstrdup(masterconf_getstring(e, socket_owner, CYRUS_USER)); +char *socket_group = xstrdup(masterconf_getstring(e, socket_group, )); +mode_t socket_mode = (mode_t) masterconf_getoctal(e, socket_mode, 0660); char *proto = xstrdup(masterconf_getstring(e, proto, tcp)); char *max = xstrdup(masterconf_getstring(e, maxchild, -1)); rlim_t maxfds = (rlim_t) masterconf_getint(e, maxfds, 256); @@ -1362,6 +1403,12 @@ Services[i].listen = listen; if (Services[i].proto) free(Services[i].proto); Services[i].proto = proto; + +if (Services[i].socket_owner) free(Services[i].socket_owner); +Services[i].socket_owner = socket_owner; +if (Services[i].socket_group) free(Services[i].socket_group); +Services[i].socket_group = socket_group; +Services[i].socket_mode = socket_mode; Services[i].exec = tokenize(cmd); if (!Services[i].exec) fatal(out of memory, EX_UNAVAILABLE); diff -u -r cyrus-imapd-2.3.12p2/master/master.h cyrus-imapd.patched/master/master.h --- cyrus-imapd-2.3.12p2/master/master.h Mon Sep 24 14:48:32 2007 +++ cyrus-imapd.patched/master/master.h Thu Aug 21 15:28:07 2008 @@ -5,6 +5,7 @@ #include config.h #include sys/resource.h /* for rlim_t */ +#include sys/types.h /* for mode_t */ #include libconfig.h /* for config_dir and IMAPOPT_SYNC_MACHINEID */ @@ -12,10 +13,13 @@ struct service { char *name; /* name of service */ char *listen; /* port/socket to listen to */ +char *socket_owner; /* Owner of the socket if listen is a UNIX socket */ +char *socket_group; /* Group of the socket if listen is a UNIX socket */ +mode_t socket_mode; /* File mode of the socket if listen is a UNIX socket */ char *proto; /* protocol to accept */ char *const *exec; /* command (with args) to