Here's a funny one. I've recreated it as a simple testcase, which I'll paste below. Basically, a message with an invalid MIME structure causes Cyrus to record the wrong "size" for it in cyrus.index.
Seems some spammers have been generating these, and they show up as replication errors because the index size doesn't match the file size.

[br...@imap3 hm]$ cat /mnt/data8/slot308/store23/data/b/user/brong/390978.
Return-Path: <[email protected]>
Received: from compute2.internal (compute2.internal [10.202.2.42])
    by store23m.internal (Cyrus v2.3.14-fmsvn18904-c7f26adc) with LMTPA;
    Wed, 24 Jun 2009 21:53:09 -0400
X-Sieve: CMU Sieve 2.3
X-Spam-score: 1.4
X-Spam-hits: BAYES_20 -0.74, MISSING_MID 0.001, NO_RECEIVED -0.001, NO_RELAYS -0.001, TVD_SPACE_RATIO 2.219, BAYES_USED user
X-Spam-source: IP='127.0.0.1', Host='unk', Country='unk', FromHeader='fm', MailFrom='fm'
X-Spam-charsets:
X-Attached: ForwardedMessage
X-Resolved-to: [email protected]
X-Mail-from: [email protected]
Received: from test ([10.202.2.231])
    by compute2.internal (LMTPProxy); Wed, 24 Jun 2009 21:53:08 -0400
Date: 20 Jun 2009 07:21:45 -0000
MIME-Version: 1.0
To: [email protected]
Subject: bogusmessage
From: [email protected]
Content-Type: multipart/mixed; boundary="=_31ff156115c676d4fc4fe82130032447"
Message-ID: <[email protected]>

--=_31ff156115c676d4fc4fe82130032447
Content-Transfer-Encoding:
Content-Type: message/rfc822; name="ForwardedMessage";
Content-Disposition: inline; filename="ForwardedMessage";

--=_31ff156115c676d4fc4fe82130032447--

[br...@imap3 hm]$ ls -la /mnt/data8/slot308/store23/data/b/user/brong/390978.
-rw------- 1 cyrus mail 1189 Jun 24 21:53 /mnt/data8/slot308/store23/data/b/user/brong/390978.
[br...@imap3 hm]$ utils/oneoff/index_uids.pl -u 390978 -D /mnt/meta8/slot308/store23/meta/b/user/brong/cyrus.index
Uid: 390978
InternalDate: 1245894789
SentDate: 1245513600
Size: 1147
HeaderSize: 961
ContentOffset: 961
CacheOffset: 1066472
LastUpdated: 1245894810
SystemFlags: 00000000000000000000000000000000
UserFlags: 00000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
ContentLines: 5
CacheVersion: 2
MessageGuid: a8c26e46c4ce83fb5d77d360f024e3bbaa8d7371
Modseq: 14869
=======================

So, the file on disk is 1189 bytes long, but cyrus.index says the size is 1147 bytes.

The reason for this is that Cyrus builds the bodystructure and calculates the size by adding up all the component parts, rather than just using the actual file size.

I guess my question is: is there any reason not to just put the actual size in bytes of the file into the index header record? Envelope parsing might be slightly messed up, but at least the basics will be OK.

Bron.
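
The mismatch above (1189 bytes on disk against an index Size of 1147, a difference of 42 bytes) can be checked mechanically. What follows is only a rough sketch in Python, assuming the index record is available as "Key: value" text like the dump above. The names parse_index_dump and size_delta are made up for illustration and are not part of Cyrus or of index_uids.pl.

```python
import os
import re
import tempfile


def parse_index_dump(text):
    """Collect 'Key: value' pairs from a one-record dump such as the one above.

    Hypothetical helper: it only assumes whitespace-separated 'Key: value'
    tokens, which is what the pasted output looks like.
    """
    return dict(re.findall(r"(\w+):\s+(\S+)", text))


def size_delta(message_path, index_size):
    """Return on-disk size minus indexed size; 0 means the two agree."""
    return os.stat(message_path).st_size - index_size


if __name__ == "__main__":
    record = parse_index_dump("Uid: 390978 Size: 1147 HeaderSize: 961")

    # Stand-in for the real spool file: a scratch file of the same length.
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(b"x" * 1189)
        path = scratch.name

    try:
        delta = size_delta(path, int(record["Size"]))
        print(f"uid {record['Uid']}: file is {delta} bytes larger than index Size")
    finally:
        os.unlink(path)
```

A non-zero result for a message flags the same kind of disagreement that shows up here as a replication error.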
