I've done more than 50 multi-TB migrations.

Pretty much: create a tar, then use rsync with tweaked encryption settings to 
speed up the transfer.
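For context, "tweaked encryption settings" usually means picking a cheaper SSH cipher so the CPU, rather than the network, stops being the bottleneck. A sketch, with a hypothetical host name and path:

```shell
# List the ciphers your OpenSSH build supports:
ssh -Q cipher
# Ship the tarball with a fast AEAD cipher (hostB and the path are
# hypothetical); on a trusted LAN this often lifts rsync-over-ssh throughput:
rsync -av -e "ssh -c aes128-gcm@openssh.com" /tmp/piler-store.tar hostB:/tmp/
```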

Nowadays we just do a ZFS sync of the pool (disk storage) to the new server's 
ZFS; it took a little over 7 hours to transfer 20 TB.
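A minimal sketch of the ZFS route, assuming a pool/dataset named tank/piler on both hosts (names are examples, not piler defaults). zfs send streams a snapshot as one sequential stream, so the per-file overhead that kills sftp/rsync disappears entirely:

```shell
# First full pass: snapshot, then stream it to the new host.
zfs snapshot tank/piler@migrate
zfs send tank/piler@migrate | ssh hostB zfs receive -F tank/piler
# Incremental catch-up after the first pass (only the changes since @migrate):
zfs snapshot tank/piler@migrate2
zfs send -i @migrate tank/piler@migrate2 | ssh hostB zfs receive tank/piler
```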


_________________ eXtremeSHOK.com _________________

> On 10 Nov 2017, at 15:57, Federico SECURITY LINE <[email protected]> wrote:
> 
> Why not make a tarball of the whole orig dir, copy the .tgz file to the dest, 
> untar, then rsync to sync the files that changed in the meantime?
> 
> Thanks,
> F.
> 
> 
> On 10 November 2017 at 11:36:12 CET, [email protected] wrote:
>> 
>> Dear piler users,
>> 
>> I had the pleasure (so to speak) of participating in a
>> piler migration project from hostA to hostB. Both hosts
>> are on the same datacenter, the network bandwidth is unknown
>> to me, but we may assume it's Gbit.
>> 
>> There were 2+ TB of data to migrate, millions of files to copy.
>> 
>> SFTP was chosen as the copying method, and I can tell you that copying
>> the /var/piler/store dirs and files was taking several days (not 2-3,
>> rather many more).
>> 
>> So my conclusion is that using sftp (or even rsync, I believe) is painful
>> for migrating a piler archive to another host because of the huge number
>> of small files.
>> 
>> I've been thinking about how to make such possible future migrations
>> both easier and faster. I think such a migration would be less painful
>> if the data in the /var/piler/store/00/... dirs were not spread across
>> lots of small files.
>> 
>> One possible solution is to use sqlite3 files, read on.
>> 
>> You know that each top-level dir in /var/piler/store/00 holds ~12 days
>> of data. After that it doesn't change; piler starts writing new emails
>> to the next directory (nowadays it's 5a0). So what if we could move all
>> files in 59f to 59f.sdb, all files in 59e to 59e.sdb, etc.?
>> 
>> Then after 4 years you may end up with 4 * 365 / 12 =~ 120 big sdb files,
>> plus the latest top-level dir with its many small files (though far fewer
>> than the files in the 120 consolidated dirs).
>> 
>> So the sdb files would be big, but copying 120 large files to another
>> host is much easier than copying 15 million smaller ones.
>> 
>> OK, now the question is how to move data into these sdb files. The plan
>> is to create a utility that iterates through the top-level dirs mentioned
>> above, writes the file contents to sqlite3 db files, and finally removes
>> the .m and .a* files that were successfully copied.
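A minimal sketch of what such a consolidation utility could look like, assuming a simple path-to-blob table layout (the table and column names are my invention, not piler's):

```python
# Hedged sketch: walk one finished top-level dir, e.g.
# /var/piler/store/00/59f, and copy every file (in piler's case the .m and
# .a* message files) into 59f.sdb, keyed by its path relative to the dir.
import os
import sqlite3

def consolidate(topdir, sdb_path):
    db = sqlite3.connect(sdb_path)
    db.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, data BLOB)")
    moved = []
    for root, _dirs, names in os.walk(topdir):
        for name in names:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, topdir)
            with open(full, "rb") as f:
                db.execute("INSERT OR REPLACE INTO files VALUES (?, ?)", (rel, f.read()))
            moved.append(full)
    db.commit()
    db.close()
    # Only remove the originals once every row is safely committed.
    for full in moved:
        os.remove(full)
```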
>> 
>> The only performance penalty (after writing the sdb files) that comes to
>> mind is that pilerget must first open the sdb file, and if it's not
>> present (either because someone opted out of this [optional] consolidation,
>> or because this is the latest top-level dir, which hasn't been consolidated
>> yet), fall back to reading the file from the filesystem.
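The lookup order described above could be sketched like this, again assuming the hypothetical path-to-blob sdb layout rather than anything pilerget actually does today:

```python
# Hedged sketch of the proposed pilerget lookup: consolidated .sdb first,
# plain file on disk as the fallback.
import os
import sqlite3

def get_message(store_root, toplevel, rel_path):
    sdb = os.path.join(store_root, toplevel + ".sdb")
    if os.path.exists(sdb):
        row = sqlite3.connect(sdb).execute(
            "SELECT data FROM files WHERE path = ?", (rel_path,)).fetchone()
        if row:
            return row[0]
    # Not consolidated (yet): read from the filesystem as today.
    with open(os.path.join(store_root, toplevel, rel_path), "rb") as f:
        return f.read()
```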
>> 
>> 
>> Another possible solution is to put all the email data blobs into mysql.
>> There would be a nice table, e.g. maildata, with 2-3 columns, the last
>> column being a huge blob. In this case, instead of having 2-3 TB of data
>> (in the case I mentioned) spread over several million files, you would
>> have one very large mysql table with blobs of varying size in each row.
>> I'm not sure if it's a good idea. In this case, instead of using
>> mysqldump, it would be much easier to stop mysqld and copy the raw db
>> files to the other host.
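If the MySQL route were taken, the table could be as simple as the following sketch (the table name comes from the mail above; the columns are my assumptions):

```sql
-- Hypothetical 3-column layout for the MySQL-blob variant:
CREATE TABLE maildata (
    piler_id CHAR(36) NOT NULL PRIMARY KEY,  -- message identifier
    len      INT UNSIGNED NOT NULL,          -- original size in bytes
    data     LONGBLOB NOT NULL               -- the message blob itself
);
```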
>> 
>> 
>> Before moving down either path I'd like to hear your comments and ideas
>> on the topic.
>> 
>> Note: using sqlite data files is intended to be optional, not forcing
>> anyone to take this step at all. However, I believe it would make
>> migrating piler much easier than it is today.
>> 
>> 
>> Janos
>> 
>> PS: Perhaps I introduce some bias with the following info, which some of
>> you may already know: mailarchiva uses 1024 (or so) zip files to hold the
>> encrypted files.
>> 
> 
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
