Kern Sibbald <k...@sibbald.com> wrote on 2010/07/24 16:46:58:
>
> On Saturday 24 July 2010 16:22:54 Joakim Tjernlund wrote:
> > Kern Sibbald <k...@sibbald.com> wrote on 2010/07/24 14:21:46:
> > > On Saturday 24 July 2010 14:01:59 Joakim Tjernlund wrote:
> > > > Joakim Tjernlund <joakim.tjernl...@transmode.se> wrote on 2010/07/20 20:02:53:
> > > > > From: Joakim Tjernlund <joakim.tjernl...@transmode.se>
> > > > > To: bacula-devel@lists.sourceforge.net
> > > > > Cc: Joakim Tjernlund <joakim.tjernl...@transmode.se>
> > > > > Date: 2010/07/20 20:24
> > > > > Subject: [Bacula-devel] [PATCH 1/2] Initial crc32 optimization.
> > > > >
> > > > > This is a straight port of the Linux kernel crc32 optimization
> > > > > I did a while ago. This impl. is several times faster than the one it
> > > > > replaces. Adapted to bacula usage.
> > > > > ---
> > > > >
> > > > > I have not tested this yet as I am on vacation but I figured
> > > > > perhaps someone else is interested enough to do that :)
> > > > >
> > > > >  bacula/src/lib/crc32.c |  386 ++++++++++++++++++++++++++++++++++++++++--------
> > > > >  1 files changed, 325 insertions(+), 61 deletions(-)
> > > >
> > > > So it has been a few days and I was hoping for one or two comments by
> > > > now.
> > >
> > > I have been terribly busy with *big* problems and past due deadlines.
> > >
> > > > One thing I have noticed is that Bacula seems to rely on dynamic
> > > > endian detection instead of detecting endian at configure/compile time.
> > >
> > > Yes.
> >
> > OK.
> >
> > > > Is this so and does it need to stay this way?
> > >
> > > I am not sure. A Mac developer was trying to create a single binary that
> > > would run on multiple architectures, which is a good idea if it will
> > > work. I don't know whether he actually uses it or not.
> >
> > ehh, how can a single binary run on multiple archs?
> > If all platforms use autoconf I think we can let autoconf
> > detect the host's endianness, see AC_C_BIGENDIAN
> >
> > We need to figure this one out before I can move on. If it needs
> > to be dynamic then I must add an init crc32 table routine that
> > runs "tole" on the crc32 table. Where should this init routine
> > be called from?
>
> For the moment, to keep your life simple, I suggest you assume that the
> dynamic endian detection is not critical. If it is we can find out when it
> is necessary, and deal with it by either calling your routine or another one
> or by generating your tables in the two endian manners in each execution or
> something.
>
> Hopefully, you can proceed without that blocking you and we deal with it
> later.
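
For illustration only, here is a minimal sketch of the "init routine" option discussed above: detect the host byte order at run time and, on a big-endian host, byte-swap a table so that its in-memory layout is little-endian. All names (`host_is_little_endian`, `swap32`, `crc32_table_to_le`) are hypothetical and are not taken from the patch or from Bacula. With compile-time detection, autoconf's AC_C_BIGENDIAN defines `WORDS_BIGENDIAN` on big-endian hosts and the swap could be selected with `#ifdef` instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Run-time check: returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v >> 8) & 0x0000FF00u)  |
           (v >> 24);
}

/*
 * Hypothetical init routine: rewrite each entry so that its bytes are
 * stored in little-endian order in memory. On a little-endian host the
 * entries already have that layout, so nothing is changed.
 */
void crc32_table_to_le(uint32_t *table, size_t n)
{
    size_t i;

    if (host_is_little_endian())
        return;
    for (i = 0; i < n; i++)
        table[i] = swap32(table[i]);
}
```

Such a routine would have to run exactly once, before the first CRC is computed; where that call belongs in Bacula is the open question raised above.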

OK, let's leave it for now.

> > > > Currently the new crc32 code needs compile time detection of endianness
> > > > and I like to keep it that way. Furthermore I suspect I won't be able
> > > > to do much more Bacula work as vacation ends in one week and work at my
> > > > dayjob is piling up.
> > >
> > > The first thing to do is to fill out an FLA and send it in. See
> > > www.bacula.org -> FLA License
> >
> > I'd rather not, let's just call this "a few lines of code" which won't
> > require a FLA. As you can see I have not claimed any CopyRight in the
> > patch I sent.
>
> If you release it into the public domain or put a 3 clause BSD copyright on it
> or LGPL or something like that I will not need the FLA, but I need either
> public domain (explicit) or some simple copyright to not require the FLA.

I see, how about GPLv2 then, same as Linux? Then I could just import the
test code and table code from there.

> > > Then the next step is to ensure that your new code and the old code can
> > > be run in unit test and can read files for CRC from STDIN. Then we can
> > > compare the
> >
> > Which unit tests? I cannot find any (note that I have never touched Bacula
> > code until now so I might have missed it)
>
> Most of my little routines like that have a #define TEST_PROGRAM at the end that
> creates a main(), ... and when you compile it into a binary, it tests itself,
> in some very primitive way. It looks like this one doesn't have such a
> routine. Take a look at src/lib/edit.c for a really simple case, or alist.c,
> or scan.c ...
>
> Actually, the code in src/lib/md5.c may be very close to what you want to
> change to do what I ask.

OK, will have a look.

> > > results of the old code with the new code and the speed.
> > > Based on those results we can see if we need to dynamically detect which
> > > algorithm to use, and if yours is not identical, and is faster, then we
> > > would need to modify the Volume format to account for this and do the
> > > dynamic detection based on the volume format. All future versions of
> > > Bacula *must* be able to correctly read older Volume versions.
> >
> > Not sure I follow you here but it seems like you think that the new crc32
> > impl. is incompatible with the current one? This is not so, the end result,
> > the calculated crc32, is the same (barring any impl. bugs) as the old one.
>
> OK, I didn't re-read your previous email, but it seems to me you said you were
> not sure. In any case, before I can replace the current algorithm, I must

I am not sure I didn't make a mistake in porting the code from Linux into
Bacula. The general idea has been proven for some time.

> prove to myself (or much faster would be you prove to me) by seeing the
> output that it is the same. I suggest something like the md5.c or sha1.c
> programs where it reads a file and creates the hash -- or in your case the
> CRC.
>
> > There won't be any need to test which one to use, this impl. is much
> > faster than the current one for all archs (unless you want to crc one byte
> > at a time which the Bacula bcrc32 isn't designed for)
>
> I am all for much faster code. The current CRC takes a good deal of time, and
> I recently added the ability to disable it, but I would really like something
> faster so that no one disables it :-)

Yes, I noticed this 10 years ago when we started to use the JFFS2 file
system in Linux. JFFS2 crc32s all its data and that took time. At the time
the Linux crc32 impl. was like yours. I then optimized it significantly and
sent it in and all was well. Recently I did another optimization pass and it
was twice as fast on x86 as the previous one.
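
To make the "same output" check requested above concrete, here is a minimal sketch that computes a CRC-32 two ways: a bit-at-a-time reference and a table-driven routine that consumes four bytes per step. It assumes the common reflected CRC-32 polynomial 0xEDB88320 with initial value and final XOR of 0xFFFFFFFF; whether bcrc32 uses exactly these parameters is an assumption. The function names are hypothetical, and this is neither the code from the patch nor the Linux kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference implementation: one bit at a time, no tables. */
uint32_t crc32_bitwise(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int k;

    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

static uint32_t tab[4][256];

/* Build the byte table, then three derived tables for 4-byte steps. */
void crc32_init_tables(void)
{
    uint32_t n, c;
    int k, t;

    for (n = 0; n < 256; n++) {
        c = n;
        for (k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
        tab[0][n] = c;
    }
    for (n = 0; n < 256; n++)
        for (t = 1; t < 4; t++)
            tab[t][n] = (tab[t - 1][n] >> 8) ^ tab[0][tab[t - 1][n] & 0xFF];
}

/* Table-driven variant: four input bytes per loop iteration. */
uint32_t crc32_slice4(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len >= 4) {
        /* Assemble the word byte by byte so the result does not
         * depend on the host byte order. */
        crc ^= (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
               ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
        crc = tab[3][crc & 0xFF] ^ tab[2][(crc >> 8) & 0xFF] ^
              tab[1][(crc >> 16) & 0xFF] ^ tab[0][crc >> 24];
        buf += 4;
        len -= 4;
    }
    while (len--)   /* remaining 0..3 bytes, one at a time */
        crc = (crc >> 8) ^ tab[0][(crc ^ *buf++) & 0xFF];
    return crc ^ 0xFFFFFFFFu;
}
```

After calling `crc32_init_tables()`, both functions should return the same value for any buffer. A TEST_PROGRAM-style main() could read a file from STDIN, print both CRCs and time each routine.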

OT: I am pushing our IT department here to use Bacula, but there is one
thing we need that Bacula cannot do (or so they think). We back up to disk
for easy access when we need to restore a file, then transfer the on-disk
backup to tape for safe, long-term storage. The on-disk backups get recycled
after some time, after which only the tape copy is left. How can we do this
in Bacula, and make Bacula understand where to look for files: on disk if
the backup is still there, otherwise on the corresponding tape?

 Jocke