Vicent Martí <tan...@gmail.com> writes:

> On Tue, Jun 25, 2013 at 5:58 PM, Thomas Rast <tr...@inf.ethz.ch> wrote:
>>
>> Please document the RLW format here.
>
> Har har. I was going to comment on your review of the Ewah patchset,
> but might as well do it here: the only thing I know about Ewah bitmaps
> is that they work. And I know this because I did extensive fuzz
> testing of my C port. Unfortunately, the original Java code I ported
> from has 0 comments, so any documentation here would have to be
> reverse-engineered.

I think the below would be a reasonable documentation, to be appended
after your description of the EWAH format.  Maybe Colby can correct me
if I got anything wrong.  You can basically read this off from the
implementation of ewah_each_bit() and the helper functions it uses.

-- 8< --
The compressed bitmap is stored in a form of run-length encoding, as
follows.  It consists of a concatenation of an arbitrary number of
chunks.  Each chunk consists of one or more 64-bit words

     H  L_1  L_2  L_3 .... L_M

H is called RLW (run length word).  It consists of (from lower to higher
order bits):

     - 1 bit: the repeated bit B

     - 32 bits: repetition count K (unsigned)

     - 31 bits: literal word count M (unsigned)

The bitstream represented by the above chunk is then:

     - K repetitions of B

     - The bits stored in `L_1` through `L_M`.  Within a word, bits at
       lower order come earlier in the stream than those at higher
       order.

The next word after `L_M` (if any) must again be a RLW, for the next
chunk.  For efficient appending to the bitstream, the EWAH stores a
format to the last RLW in the stream.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to