Why not make a tarball of the whole original dir, copy the .tgz file to the destination, untar it, then rsync to sync the files that changed in the meantime?
Thanks, F.

On 10 November 2017 11:36:12 CET, [email protected] wrote:
> Dear piler users,
>
> I had the pleasure (so to speak) of participating in a piler migration
> project from hostA to hostB. Both hosts are in the same datacenter;
> the network bandwidth is unknown to me, but we may assume it's Gbit.
>
> There were 2+ TB of data to migrate, millions of files to copy.
>
> Sftp was chosen as the copying method, and I can tell you that copying
> the /var/piler/store dirs and files took several days (not 2-3, but
> many more).
>
> So my conclusion is that using sftp (or even rsync, I believe) makes
> it painful to migrate a piler archive to another host, because of the
> many small files.
>
> I've been thinking about how to make such possible future migrations
> both easier and faster. I think such a migration would be less painful
> if the data in the /var/piler/store/00/... dirs were not spread across
> so many files.
>
> One possible solution is to use sqlite3 files, read on.
>
> You know that each top level dir in /var/piler/store/00 holds ~12 days
> of data. After that it doesn't change; piler starts writing new emails
> to the next directory, which nowadays is 5a0. So what if we could move
> all files in 59f to 59f.sdb, all files in 59e to 59e.sdb, etc.?
>
> Then after 4 years you may end up with 4 * 365 / 12 =~ 120 big sdb
> files, plus the latest top level dir with its many small files (though
> far fewer than the files in the 120 consolidated dirs).
>
> So the sdb files would be big, but copying 120 large files to another
> host is much easier than copying 15 million smaller ones.
>
> OK, now the question is how to move data into these sdb files. The
> plan is to create a utility that iterates through the top level dirs
> mentioned before and writes the file contents to sqlite3 db files,
> finally removing the .m and .a* files that were successfully copied.
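The consolidation utility described above could be sketched roughly like this; the `blobs` table name and schema are hypothetical, not an actual piler format:

```python
import os
import sqlite3

def consolidate_dir(top_dir, sdb_path):
    """Copy every file under top_dir into an sqlite3 file (hypothetical schema),
    then remove the originals once the transaction is committed."""
    db = sqlite3.connect(sdb_path)
    db.execute("CREATE TABLE IF NOT EXISTS blobs (name TEXT PRIMARY KEY, data BLOB)")
    stored = []
    for root, _dirs, files in os.walk(top_dir):
        for fname in files:
            path = os.path.join(root, fname)
            rel = os.path.relpath(path, top_dir)
            with open(path, "rb") as f:
                db.execute("INSERT OR REPLACE INTO blobs VALUES (?, ?)",
                           (rel, f.read()))
            stored.append(path)
    db.commit()
    db.close()
    # only delete the small files after everything is safely in the sdb
    for path in stored:
        os.remove(path)
```

A real tool would of course need to handle errors and verify the sdb before deleting anything.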
> The only performance penalty (after writing the sdb files) that comes
> to my mind is that pilerget must first open the sdb file, and if it's
> not present (either because someone is not interested in this
> [optional] consolidation, or because this is the last top level dir
> and has not been consolidated yet), then get the file from the
> filesystem.
>
> Another possible solution is to put all email data blobs into mysql.
> There would be a single table, e.g. maildata, with 2-3 columns, the
> last column being a huge blob. In this case, instead of having 2-3 TB
> of data (in the case I mentioned) in several million files, you would
> have one very large mysql table with blobs of varying sizes in each
> row. I'm not sure if it's a good idea. In this case, instead of using
> mysqldump, it would be much easier to stop mysqld and copy the raw db
> files to the other host.
>
> Before going down either path I'd like to hear your comments and
> ideas on the topic.
>
> Note: using sqlite data files is intended to be optional, not forcing
> anyone to take this step at all. However, I believe it would make
> migrating piler much easier than it is today.
>
> Janos
>
> PS: Perhaps I introduce some bias with the following info, which some
> of you may already know: mailarchiva uses 1024 (or so) zip files to
> hold the encrypted files.

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
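The pilerget fallback described in the quoted mail (try the sdb first, otherwise read the plain file) could look like the sketch below, assuming the same hypothetical `blobs` schema:

```python
import os
import sqlite3

def get_message(store_dir, top, rel_name):
    """Fetch a stored file: try <top>.sdb first, fall back to the filesystem."""
    sdb_path = os.path.join(store_dir, top + ".sdb")
    if os.path.exists(sdb_path):
        db = sqlite3.connect(sdb_path)
        row = db.execute("SELECT data FROM blobs WHERE name = ?",
                         (rel_name,)).fetchone()
        db.close()
        if row:
            return row[0]
    # dir not consolidated yet (or file missing from the sdb): plain file
    with open(os.path.join(store_dir, top, rel_name), "rb") as f:
        return f.read()
```

Opening an sqlite3 file and doing one indexed lookup is cheap, so the penalty per retrieval should be small.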
