Steven, this work to speed up LZJB is great.  I look forward to seeing it
in illumos & all platforms.  I just have a few general comments:

Personally, I don't really care about performance on 32-bit platforms.  So
I'd prefer to simplify the code by just having the 64-bit optimized
version, and letting the compiler do its best with uint64_t's on 32-bit.
That said, I'm not sure whether anyone else cares about this level of
performance on 32-bit.
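
To make the "single code path" idea concrete, here is a minimal sketch.  It
is not taken from lzjb_fast.c, and the name copy_words is made up for
illustration.  It moves data through a uint64_t and leaves it to the compiler
to split the access into two 32-bit operations where needed:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not from lzjb_fast.c: copy len bytes in 8-byte
 * chunks.  A fixed-size 8-byte memcpy() is typically compiled to a single
 * load/store on 64-bit targets and to a pair of 32-bit ones on 32-bit
 * targets, so no separate 32-bit code path is needed.
 * The source and destination must not overlap.
 */
static void
copy_words(uint8_t *dst, const uint8_t *src, size_t len)
{
	uint64_t w;

	while (len >= sizeof (w)) {
		memcpy(&w, src, sizeof (w));
		memcpy(dst, &w, sizeof (w));
		src += sizeof (w);
		dst += sizeof (w);
		len -= sizeof (w);
	}
	/* Tail: fewer than 8 bytes left, copy them one at a time. */
	while (len-- > 0)
		*dst++ = *src++;
}
```

Going through memcpy() rather than casting the pointers also avoids
unaligned-access problems on platforms that care about alignment.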

The code should be formatted like the rest of the ZFS code.  The "cstyle"
program on illumos can check some of this. The main thing I noticed is the
indentation, which should be one tab per level (displayed as 8 spaces).
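
As a rough illustration of the layout I mean (a made-up function, only here
to show the formatting): one tab per indentation level, the return type on
its own line, and the return value in parentheses.

```c
#include <stddef.h>

/*
 * Hypothetical example, shown only for its layout: return the smaller of
 * two lengths, but never more than limit.
 */
static size_t
min_len(size_t a, size_t b, size_t limit)
{
	size_t m = (a < b) ? a : b;

	if (m > limit)
		m = limit;
	return (m);
}
```

cstyle remains the authority on what passes; this only shows the general
shape.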

--matt


On Sat, Oct 26, 2013 at 11:49 AM, Strontium <[email protected]> wrote:

> Update:
> I have updated my source at https://github.com/stevenj/lzjbbench
>
> I have cleaned up the source of the new method considerably.  I now no
> longer think of it as "hacky".
>
> It lives in its own file:
> https://github.com/stevenj/lzjbbench/blob/master/lzjb_fast.c
>
> The latest version seems at least 20% faster than the previous version.
>  It is Pure C, with -O2 optimization, BUT NO MMX, and NO SSE.
>
> The only instruction "tweak" I use is the GCC builtin "__builtin_ctz",
> which improves performance by a couple of percentage points.  But there is
> a pure C fallback which is still very fast should it be unavailable at
> compile time.
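
For readers who have not seen this pattern, a builtin with a portable
fallback usually looks something like the sketch below.  The names ctz32 and
ctz32_portable are hypothetical, and this is not necessarily how
lzjb_fast.c arranges it:

```c
#include <stdint.h>

/*
 * Portable fallback: count the trailing zero bits of v by shifting.
 * v must be non-zero, otherwise the loop never terminates.
 */
static inline unsigned
ctz32_portable(uint32_t v)
{
	unsigned n = 0;

	while ((v & 1) == 0) {
		v >>= 1;
		n++;
	}
	return (n);
}

/*
 * Hypothetical wrapper: use the compiler builtin when it is available.
 * __builtin_ctz() is undefined for 0, so the non-zero requirement is the
 * same on both paths.
 */
static inline unsigned
ctz32(uint32_t v)
{
#if defined(__GNUC__)
	return ((unsigned)__builtin_ctz(v));
#else
	return (ctz32_portable(v));
#endif
}
```

Both paths return the same value for every non-zero input, so callers do not
need to know which one was compiled in.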
>
> It "Should" work on Big Endian and 32 bit architectures.  BUT I have not
> tested it on these.
>
> The code has been written with the intention of it integrating easily into
> ZFS, and so, it includes ZFS headers and uses ZFS types.  As far as I can
> tell it should plug straight in with very little effort.
>
> On 64-bit little endian, it has been run through extensive data tests
> using the compression corpora Calgary, Canterbury, Silesia and enwik8.
> It can successfully decode LZJB-compressed versions of all these files.  I
> am not aware of any data decoding issues.  I have also produced customized
> test data to exercise the RLE optimization, as it has many possible code
> paths; each path has been fully covered by testing.
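
As background on why a run-length case tends to split into several paths: in
an LZ-style stream a match may start fewer bytes back than it is long, so the
copy overlaps its own output and repeats a pattern.  The sketch below is
hypothetical (copy_match is an invented name, and this is not the code in
lzjb_fast.c); it only shows the three cases such a copy has to tell apart:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch, not the code in lzjb_fast.c.  Expand a match of len
 * bytes that starts offset bytes behind dst (offset >= 1).  Returns the
 * new output position.
 */
static uint8_t *
copy_match(uint8_t *dst, size_t offset, size_t len)
{
	const uint8_t *cpy = dst - offset;

	if (offset == 1) {
		/* Run of a single byte: one memset() instead of a loop. */
		memset(dst, *cpy, len);
		return (dst + len);
	}
	if (offset >= len) {
		/* Source and destination do not overlap: block copy. */
		memcpy(dst, cpy, len);
		return (dst + len);
	}
	/* Overlap: copy byte by byte so the last offset bytes repeat. */
	while (len-- > 0)
		*dst++ = *cpy++;
	return (dst);
}
```

Test data for this kind of code needs at least one input per branch, which is
presumably why custom data was needed on top of the standard corpora.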
>
> The LZJB bitstream does not lend itself to high-speed extraction.  I
> believe that, without resorting to assembler and extended instruction sets,
> there is little further room for improvement.  An improved LZJB bitstream
> generator would in theory allow decompression to speed up; however, at the
> moment the only LZJB compressor is the stock one from ZFS.
>
> I am currently running an exhaustive test.  Full results will follow
> in a post tomorrow.
>
> Subject to testing on 32-bit and big-endian architectures, or any
> unexpected results from the current test run, I believe this improved
> compressor is complete and is ready for wider testing.
>
> Steven Johnson
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
