Thanks,  Raul.

A naive implementation of the forward transform seems quite fast for text
up to about 20000 bytes on my machine.

On the other hand,  the crude reverse transform is disastrous (eg 12 
seconds
for 1000 chars),  so that seems to be where work is needed under J unless
you're looking to compress long articles or whole books!

Wiki points at Burrows and Wheeler's article
http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.pdf
which has plenty of discussion of optimising the forward and reverse
algorithms.

Mike

On 08/09/2011 2:17 PM, Raul Miller wrote:
> http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform
>
> The Burrows Wheeler transform is a reversible transform on text which
> makes strings where some characters occur much more frequently than
> others easier to compress.  See the wikipedia url for more details.
>
> Here's an inefficient but straightforward implementation, which
> assumes that EAV is not contained in the string:
>
>     bwt=:{:"1@/:~@(|."0 1~ i.@-@#)@(,&EAV)
>     ibwt=: [: }.@(|.~ i.&EAV)@{. (,. /:~)^:(#@] - 1:)~
>
> And, here is the wikipedia example:
>
>     bwt '^BANANA'
> BNN^AA�A
>     ibwt bwt '^BANANA'
> ^BANANA
>
> Note that a more robust implementation would have an index rather than
> EAV to mark end-of-string location (which is necessary for reversal to
> succeed), so the result should be a pair of boxes (containing an index
> and an string).
>
> Building more efficient implementations might interest some of you?
> (I'll probably be trying this, myself, when I have some free time.)
>

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to