Re: Did calculating the quota change from 2.3 to 2.5?

2016-12-31 Thread Wolfgang Breyha via Info-cyrus
On 31/12/16 06:17, Bron Gondwana via Info-cyrus wrote:

> If your cyrus.* files are identical then you have pretty weird mailboxes,
> but yeah - I guess it could happen if you had two folders with identical
> messages in identical order and all the timestamps identical.

In particular, the cyrus.cache/squat files of empty mailboxes and most
cyrus.annotation (2.5) files got linked together on my test host.

I use freedup with the patch sent in
https://mid.mail-archive.com/info-cyrus@lists.andrew.cmu.edu/msg42850.html
on a regular basis without harm.

In the meantime, the patch has been included in the most recent release of
freedup, 1.6-3.

Greetings, Wolfgang
-- 
Wolfgang Breyha  | http://www.blafasel.at/
Vienna University Computer Center | Austria


Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus


Re: Did calculating the quota change from 2.3 to 2.5?

2016-12-30 Thread Bron Gondwana via Info-cyrus
On Sat, 31 Dec 2016, at 10:57, Wolfgang Breyha via Info-cyrus wrote:
> On 29/11/16 22:37, Jason L Tibbitts III via Info-cyrus wrote:
> > Fun random question: Does anything blow up if you run hardlink on your
> > mail spool?  (The hardlink program finds identical files and hardlinks
> > them.)
> 
> Using "hardlink" is IMO not safe on IMAP spools since it also links cyrus.*
> files, which is definitely not what you want.

If your cyrus.* files are identical then you have pretty weird mailboxes, but 
yeah - I guess it could happen if you had two folders with identical messages 
in identical order and all the timestamps identical.

> I recommend using freedup instead. Something like
> freedup -n -v -a -T -o '-name "*."' -l /var/spool/..

But yeah, that looks safe :)

It shouldn't blow anything up so long as it renames the file into place
atomically.  Also be a little careful about file ownership: you'll want to run
this as the cyrus user so it retains ownership and access (unless freedup
fixes up ownership; I've never tried it myself).
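
To illustrate what "renames the file into place atomically" means here, the following is a minimal Python sketch, not freedup's actual implementation. It assumes a POSIX filesystem with both files on the same volume; the function name and the "tmp" suffix are hypothetical.

```python
import os
import tempfile

def replace_with_hardlink(src: str, dst: str) -> None:
    """Make dst a hard link to src without a window where dst is missing."""
    tmp = dst + "tmp"        # hypothetical temporary name next to dst
    os.link(src, tmp)        # new directory entry pointing at src's inode
    os.replace(tmp, dst)     # rename(2): swaps tmp in for dst in one step

# Demo in a scratch directory with two identical "message" files.
with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "1.")
    b = os.path.join(d, "2.")
    for p in (a, b):
        with open(p, "wb") as f:
            f.write(b"Subject: hi\r\n\r\nbody\r\n")
    replace_with_hardlink(a, b)
    same_inode = os.stat(a).st_ino == os.stat(b).st_ino
    link_count = os.stat(a).st_nlink
    print(same_inode, link_count)
```

Because both names end up on one inode, they also share one owner and mode, which is why the ownership of the source file matters.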

Bron.



-- 
  Bron Gondwana
  br...@fastmail.fm



Re: Did calculating the quota change from 2.3 to 2.5?

2016-12-30 Thread Wolfgang Breyha via Info-cyrus
On 29/11/16 22:37, Jason L Tibbitts III via Info-cyrus wrote:
> Fun random question: Does anything blow up if you run hardlink on your
> mail spool?  (The hardlink program finds identical files and hardlinks
> them.)

Using "hardlink" is IMO not safe on IMAP spools since it also links cyrus.*
files, which is definitely not what you want.

I recommend using freedup instead. Something like
freedup -n -v -a -T -o '-name "*."' -l /var/spool/..
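
As a rough illustration of what the -name "*." filter is meant to select: message files in the spool are named after their UID followed by a dot (as in the script Bron posted, which builds "$spool/$uid."), while the metadata files are named cyrus.*. A minimal Python sketch of that selection, with a made-up helper name and sample file names:

```python
import fnmatch

def is_message_file(name: str) -> bool:
    """True for spool message files such as '1234.', False for cyrus.* files."""
    return fnmatch.fnmatchcase(name, "*.") and not name.startswith("cyrus.")

# Sample directory listing (made up for this example).
names = ["1.", "1234.", "cyrus.index", "cyrus.cache", "cyrus.header"]
message_files = [n for n in names if is_message_file(n)]
print(message_files)
```

Only names ending in a dot pass the filter, so the cyrus.* files are never candidates for linking.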

Greetings, Wolfgang
-- 
Wolfgang Breyha  | http://www.blafasel.at/
Vienna University Computer Center | Austria




Re: Did calculating the quota change from 2.3 to 2.5?

2016-11-30 Thread Adam Tauno Williams via Info-cyrus
> > If you use imapsync, it doesn't know about that, and will upload
> > the same message twice. 2.5 doesn't have the smarts to recognise
> > that it's the same message.
> imapsync can only sync mail the old server knows about. And in the
> end there is more quota used on the new server!?
> The only explanation is the quota on the old server is broken, isn't
> it?

No, IMAP doesn't know about deduplication, so an imapsync between two
servers undoes the deduplication.  imapsync may also repair damaged or
missing message headers, meaning the messages are no longer the same, so
a tool like hardlink will not return you to the same count in du as on
the old server.
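
A small Python sketch of why a changed header defeats content-based deduplication. The messages are made up, and SHA-1 is used only as a stand-in for "compare the bytes"; this is not a claim about how hardlink or Cyrus compares files internally.

```python
import hashlib

def digest(message: bytes) -> str:
    """Content fingerprint of a raw message (SHA-1 as an illustrative stand-in)."""
    return hashlib.sha1(message).hexdigest()

# The same mail before and after a hypothetical header repair.
original = b"Subject: hello\r\n\r\nbody\r\n"
repaired = b"Date: Wed, 30 Nov 2016 08:00:00 +0000\r\n" + original

unchanged_matches = digest(original) == digest(original)
repaired_matches = digest(original) == digest(repaired)
print(unchanged_matches, repaired_matches)
```

Any byte-level change, even one added header line, makes the two copies distinct files as far as a deduplicator is concerned.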

And then there is the [virtuous] issue of delayed expunge.

-- 
Adam Tauno Williams  GPG D95ED383
Systems Administrator, Python Developer, LPI / NCLA



Re: Did calculating the quota change from 2.3 to 2.5?

2016-11-30 Thread Marc Patermann via Info-cyrus

Bron,

On 29.11.2016 at 22:18, Bron Gondwana wrote:

> Quota is a sum of byte sizes of raw unexpunged messages. It doesn't
> deduplicate. Likely issue is incorrect quota_mailbox_used in the
> cyrus.index header on 2.3. a reconstruct will fix those, then quota
> -f again.

That does not change anything.

> It's not related to du.
>
> The problem with imapsync is that it doesn't handle single instance
> store. If you have copied messages or delivered them into multiple
> mailboxes with sieve, they will have hard links on disk.
>
> If you use imapsync, it doesn't know about that, and will upload the
> same message twice. 2.5 doesn't have the smarts to recognise that
> it's the same message.

imapsync can only sync mail the old server knows about. And in the end
there is more quota used on the new server!?

The only explanation is the quota on the old server is broken, isn't it?

Marc



Re: Did calculating the quota change from 2.3 to 2.5?

2016-11-29 Thread Bron Gondwana via Info-cyrus
On Wed, 30 Nov 2016, at 08:37, Jason L Tibbitts III via Info-cyrus wrote:
> > "BG" == Bron Gondwana via Info-cyrus writes:
> 
> BG> If you use imapsync, it doesn't know about that, and will upload the
> BG> same message twice. 2.5 doesn't have the smarts to recognise that
> BG> it's the same message.
> 
> Fun random question: Does anything blow up if you run hardlink on your
> mail spool?  (The hardlink program finds identical files and hardlinks
> them.)

No, that is fine.

> Given an index of message-id/filenames it should be possible to write a
> deduplicator that's orders of magnitude faster than hardlink, but I have
> a sneaking suspicion that someone's already done that.

Yep, I wrote something which can read 2.5 cyrus.index files and hardlink
matching files.  It depends on a ton of FastMail internals though.

3.0 will have much better support for deduplication when you upload via
IMAP, because it will know where all the other copies in the same user are.
There's no support for cross-user deduplication because we don't use it at
all: every user gets their own sieve script and their own lmtp pre-processing
at FastMail, so every message will have different headers and hence a
different GUID.  I have to prioritise designs that I actually use.


#!/usr/bin/perl -w

# SETUP {{{
use strict;
use warnings;
use ME;
use Date::Manip;
use IO::File;
use ME::Machine;
use Cyrus::HeaderFile;
use Data::Dumper;
use Cyrus::IndexFile;
use Getopt::Std;
use Digest::SHA;
use ME::CyrusBackup;
use ME::User;
# }}}

# Optional slot name; if given, only that slot is processed.
my $sn = shift;

my (undef,undef,$uid,$gid) = getpwnam('cyrus');

foreach my $Slot (ME::Machine->ImapSlots()) {
  next if ($sn and $sn ne $Slot->Name());
  my $users = $Slot->AllMailboxes();
  my $conf = $Slot->ImapdConf();
  foreach my $user (sort keys %$users) {
    process($conf, $user, $users->{$user});
  }
}

sub process {
  my ($conf, $user, $folders) = @_;
  print "$user\n";

  # Map each message GUID to the [folder, uid] pairs that carry it.
  my %ihave;
  foreach my $folder (@$folders) {
    my $meta = $conf->GetUserLocation('meta', $user, 'default', $folder);
    my $index = Cyrus::IndexFile->new_file("$meta/cyrus.index") || die "Failed to open $meta/cyrus.index";
    while (my $record = $index->next_record()) {
      push @{$ihave{$record->{MessageGuid}}}, [$folder, $record->{Uid}];
    }
  }

  foreach my $guid (keys %ihave) {
    # Only GUIDs that occur more than once can be deduplicated.
    next if @{$ihave{$guid}} <= 1;
    my ($inode, $srcname);
    my @others;
    foreach my $item (@{$ihave{$guid}}) {
      my $spool = $conf->GetUserLocation('spool', $user, 'default', $item->[0]);
      $spool =~ s{/$}{};
      my $file = "$spool/$item->[1].";
      my (@sd) = stat($file);
      next unless @sd;  # skip files that are missing on disk
      if ($inode) {
        # Already linked to the first copy: nothing to do.
        next if $sd[1] == $inode;
        push @others, $file;
      }
      else {
        # The first copy found becomes the link source.
        $inode = $sd[1];
        $srcname = $file;
      }
    }
    next unless @others;
    print "fixing up files for $guid ($srcname)\n";
    foreach my $file (@others) {
      # Link under a temporary name, then rename over the old file.
      my $tmpfile = $file . "tmp";
      print "link error $tmpfile\n" unless link($srcname, $tmpfile);
      chown($uid, $gid, $tmpfile);
      chmod(0600, $tmpfile);
      print "rename error $file\n" unless rename($tmpfile, $file);
    }
  }
}
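
The grouping step in the script above does not depend on the FastMail modules, and can be sketched on its own. The following Python version uses made-up (folder, uid, guid) records in place of real cyrus.index data; the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical index records: (folder, uid, guid). In the script above these
# come from cyrus.index; here they are made-up sample data.
records = [
    ("INBOX", 1, "aaa"),
    ("INBOX", 2, "bbb"),
    ("Archive", 7, "aaa"),
    ("Lists", 3, "ccc"),
    ("Lists", 4, "aaa"),
]

def duplicate_groups(records):
    """Return {guid: [(folder, uid), ...]} for GUIDs seen more than once."""
    by_guid = defaultdict(list)
    for folder, uid, guid in records:
        by_guid[guid].append((folder, uid))
    return {g: locs for g, locs in by_guid.items() if len(locs) > 1}

groups = duplicate_groups(records)
print(groups)
```

Each group then needs one source file and a link-and-rename for every other member, which is what the second loop of the script does.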





-- 
  Bron Gondwana
  br...@fastmail.fm



Re: Did calculating the quota change from 2.3 to 2.5?

2016-11-29 Thread Jason L Tibbitts III via Info-cyrus
> "BG" == Bron Gondwana via Info-cyrus writes:

BG> If you use imapsync, it doesn't know about that, and will upload the
BG> same message twice. 2.5 doesn't have the smarts to recognise that
BG> it's the same message.

Fun random question: Does anything blow up if you run hardlink on your
mail spool?  (The hardlink program finds identical files and hardlinks
them.)

Given an index of message-id/filenames it should be possible to write a
deduplicator that's orders of magnitude faster than hardlink, but I have
a sneaking suspicion that someone's already done that.

 - J<



Re: Did calculating the quota change from 2.3 to 2.5?

2016-11-29 Thread Bron Gondwana via Info-cyrus
Quota is a sum of byte sizes of raw unexpunged messages. It doesn't
deduplicate. The likely issue is an incorrect quota_mailbox_used in the
cyrus.index header on 2.3. A reconstruct will fix those; then run quota -f
again.

It's not related to du.

The problem with imapsync is that it doesn't handle single instance store. If
you have copied messages or delivered them into multiple mailboxes with sieve,
they will have hard links on disk.

If you use imapsync, it doesn't know about that, and will upload the same 
message twice. 2.5 doesn't have the smarts to recognise that it's the same 
message.
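
The difference between a quota-style sum and a du-style figure can be reproduced with two hard-linked files. This is a simplified Python sketch with made-up file names: it counts file sizes rather than disk blocks, and it is not how Cyrus or du are implemented.

```python
import os
import tempfile

def quota_style_bytes(paths):
    """Sum the size of every message file, counting each path separately."""
    return sum(os.stat(p).st_size for p in paths)

def du_style_bytes(paths):
    """Sum sizes once per inode, so hard-linked copies are counted once."""
    seen = {}
    for p in paths:
        st = os.stat(p)
        seen[(st.st_dev, st.st_ino)] = st.st_size
    return sum(seen.values())

with tempfile.TemporaryDirectory() as d:
    inbox = os.path.join(d, "inbox_1.")
    copy = os.path.join(d, "archive_1.")
    with open(inbox, "wb") as f:
        f.write(b"x" * 1000)
    os.link(inbox, copy)   # single instance store: second name, same inode
    paths = [inbox, copy]
    quota_bytes = quota_style_bytes(paths)
    disk_bytes = du_style_bytes(paths)
    print(quota_bytes, disk_bytes)
```

Here the quota-style sum is twice the du-style figure. After an imapsync the two copies become separate files, so both figures would count the message twice.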

Bron.

On Wed, 30 Nov 2016, at 01:24, Marc Patermann via Info-cyrus wrote:
> Bron,
> 
> On 29.11.2016 at 13:26, Bron Gondwana via Info-cyrus wrote:
> > No, the quota calculations are identical.  It's possible that your
> > quota was incorrectly calculated on the source server though.  A
> > quota -f there should correct the calculations.
> Unfortunately it does not.
> 
> For some mailboxes, the result of quota -f does not seem to be related in
> any way to the du figure on the old server.
> 
> First we create the mailbox on the new server and sync the quota.
> Then imapsync syncs the messages.
> Till the quota is exceeded …
> 
> oldserver> lq user.xxx
>   STORAGE 658949/125 (52.71592%)
> 
> # du -sh /var/lib/imap/meta/user/xxx/
> 105M    /var/lib/imap/meta/user/xxx/
> # du -sh /var/spool/imap/user/xxx/
> 1,2G    /var/spool/imap/user/xxx/
> 
> 
> newserver> lq user.xxx
>   STORAGE 1098788/125 (87.90304%)
> 
> # du -sh /var/spool/imap/user/xxx/
> 1,7G    /var/spool/imap/user/xxx/
> 
> There is no separate meta partition on the new server.
> Metadata is now about 500 MB on the new server, which is about 5x the space.
> 
> I think quota is just plain wrong on the old server.
> 
> The squatter files are huge in comparison now.
> Is this right?
> 
> 
> Marc
> 


-- 
  Bron Gondwana
  br...@fastmail.fm


Re: Did calculating the quota change from 2.3 to 2.5?

2016-11-29 Thread Marc Patermann via Info-cyrus

Bron,

On 29.11.2016 at 13:26, Bron Gondwana via Info-cyrus wrote:

> No, the quota calculations are identical.  It's possible that your
> quota was incorrectly calculated on the source server though.  A
> quota -f there should correct the calculations.

Unfortunately it does not.

For some mailboxes, the result of quota -f does not seem to be related in
any way to the du figure on the old server.


First we create the mailbox on the new server and sync the quota.
Then imapsync syncs the messages.
Till the quota is exceeded …

oldserver> lq user.xxx
 STORAGE 658949/125 (52.71592%)

# du -sh /var/lib/imap/meta/user/xxx/
105M    /var/lib/imap/meta/user/xxx/
# du -sh /var/spool/imap/user/xxx/
1,2G    /var/spool/imap/user/xxx/


newserver> lq user.xxx
 STORAGE 1098788/125 (87.90304%)

# du -sh /var/spool/imap/user/xxx/
1,7G    /var/spool/imap/user/xxx/
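
The two lq lines above can be cross-checked with a little arithmetic. This Python sketch assumes that the percentage shown is simply usage divided by limit, and that STORAGE usage is reported in units of 1024 bytes; the helper name is made up.

```python
def implied_limit(used: int, percent: float) -> float:
    """Limit that would produce the given percentage for the given usage."""
    return used / (percent / 100.0)

# Figures copied from the lq output for user.xxx.
old_used, old_pct = 658949, 52.71592
new_used, new_pct = 1098788, 87.90304

old_limit = implied_limit(old_used, old_pct)
new_limit = implied_limit(new_used, new_pct)
growth = new_used / old_used   # how much more the new server counts

print(round(old_limit), round(new_limit), round(growth, 2))
```

Under those assumptions both servers work from the same limit, and the new server counts roughly two thirds more usage for the same mailbox.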

There is no separate meta partition on the new server.
Metadata is now about 500 MB on the new server, which is about 5x the space.

I think quota is just plain wrong on the old server.

The squatter files are huge in comparison now.
Is this right?


Marc



Re: Did calculating the quota change from 2.3 to 2.5?

2016-11-29 Thread Bron Gondwana via Info-cyrus
No, the quota calculations are identical.  It's possible that your quota was 
incorrectly calculated on the source server though.  A quota -f there should 
correct the calculations.

Regards,

Bron.

On Tue, 29 Nov 2016, at 01:36, Marc Patermann via Info-cyrus wrote:
> Hi,
> 
> while migrating from 2.3 to 2.5 (see my last post here), mailboxes cannot
> be synced because the quota is exceeded on the new server.
> 
> A mailbox which has a du of about 800 MB in a 900 MB quota mailbox fills
> the new mailbox to over 100%.
> 
> Are meta files now calculated into the quota?
> 
> 
> Marc
> 


-- 
  Bron Gondwana
  br...@fastmail.fm



Did calculating the quota change from 2.3 to 2.5?

2016-11-28 Thread Marc Patermann via Info-cyrus

Hi,

while migrating from 2.3 to 2.5 (see my last post here), mailboxes cannot
be synced because the quota is exceeded on the new server.

A mailbox which has a du of about 800 MB in a 900 MB quota mailbox fills
the new mailbox to over 100%.


Are meta files now calculated into the quota?


Marc
