Steven, this work to speed up LZJB is great. I look forward to seeing it in illumos & all platforms. I just have a few general comments:

Personally, I don't really care about performance on 32-bit platforms. So I'd prefer to simplify the code by just having the 64-bit optimized version, and letting the compiler do its best with uint64_t's on 32-bit. I'm not sure whether anyone else cares about this level of performance on 32-bit, though.

The code should be formatted like the rest of the ZFS code. The "cstyle" program on illumos can check some of this. The main thing I noticed is the indentation, which should be one tab per level (displayed as 8 spaces).

--matt

On Sat, Oct 26, 2013 at 11:49 AM, Strontium <[email protected]> wrote:

> Update:
> I have updated my source at https://github.com/stevenj/lzjbbench
>
> I have cleaned up the source of the new method considerably. I now no
> longer think of it as "hacky".
>
> It lives in its own file:
> https://github.com/stevenj/lzjbbench/blob/master/lzjb_fast.c
>
> The latest version seems at least 20% faster than the previous version.
> It is Pure C, with -O2 optimization, BUT NO MMX, and NO SSE.
>
> The only instruction "tweak" i use is the GCC builtin "__builtin_ctz"
> which improves performance by a couple of percentage points. But there is
> a pure C fallback which is still very fast should it be unavailable at
> compile time.
>
> It "Should" work on Big Endian and 32 bit architectures. BUT I have not
> tested it on these.
>
> The code has been written with the intention of it integrating easily into
> ZFS, and so, it includes ZFS headers and uses ZFS types. As far as I can
> tell it should plug straight in with very little effort.
>
> On 64 Bit little endian, it has been run through extensive data tests
> using the compression corpuses: Calgary, Canterbury, Silesia and enwik8.
> It can successfully decode lzjb compressed versions of all these files. I
> am not aware of any data decoding issues. I also have produced customized
> test data to exercise the RLE Optimization as it has many possible code
> paths, each path has been fully covered by testing.
>
> The LZJB bitstream does not lend itself to high speed extraction, I
> believe without resorting to assembler and extended instruction sets there
> is little further room for improvement. An improved LZJB bitstream
> generator would in theory allow decompression to speed up, however at the
> moment the only LZJB compressor is the stock one from ZFS.
>
> I am currently running an exhaustive test run. Full results will follow
> in a follow up post tomorrow.
>
> Subject to testing on 32bit and big endian architectures, or any
> unexpected results from the current test run, I believe this improved
> compressor is complete and is ready for wider testing.
>
> Steven Johnson
>
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
>
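
For readers following along, the "builtin with a pure C fallback" pattern mentioned in the quoted message could look roughly like the sketch below. This is not taken from lzjb_fast.c: the function name count_trailing_zeros is made up for illustration, and the fallback loop is just one simple way to do it. The only real API used is GCC's __builtin_ctz, whose result is undefined for an argument of 0, so the sketch assumes a nonzero input. It is indented with one tab per level.

```c
#include <stdint.h>

/*
 * Hypothetical wrapper (name is an assumption, not from lzjb_fast.c).
 * Returns the number of trailing zero bits in v.
 * Precondition: v != 0, matching the requirement of __builtin_ctz.
 */
static inline unsigned
count_trailing_zeros(uint32_t v)
{
#if defined(__GNUC__)
	/* GCC (and compatible compilers) provide this builtin. */
	return ((unsigned)__builtin_ctz(v));
#else
	/* Pure C fallback: shift right until the lowest set bit is found. */
	unsigned n = 0;

	while ((v & 1) == 0) {
		v >>= 1;
		n++;
	}
	return (n);
#endif
}
```

Whether a single wrapper like this or separate code paths is the better fit depends on how the decompressor actually uses the result, which is not shown in this thread.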
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
