The forward transform can be implemented as: bwt =: (/: 1&|.)@:(,&EAV) bwt '^BANANA' BNN^AA�A
The backwards transform is likely to take quite a bit more work. Marshall On Thu, Sep 8, 2011 at 11:27 AM, Mike Day <[email protected]>wrote: > Thanks, Raul. > > A naive implementation of the forward transform seems quite fast for text > up to about 20000 bytes on my machine. > > On the other hand, the crude reverse transform is disastrous (eg 12 > seconds > for 1000 chars), so that seems to be where work is needed under J unless > you're looking to compress long articles or whole books! > > Wiki points at Burrows and Wheeler's article > http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.pdf > which has plenty of discussion of optimising the forward and reverse > algorithms. > > Mike > > On 08/09/2011 2:17 PM, Raul Miller wrote: > > http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform > > > > The Burrows Wheeler transform is a reversible transform on text which > > makes strings where some characters occur much more frequently than > > others easier to compress. See the wikipedia url for more details. > > > > Here's an inefficient but straightforward implementation, which > > assumes that EAV is not contained in the string: > > > > bwt=:{:"1@/:~@(|."0 1~ i.@-@#)@(,&EAV) > > ibwt=: [: }.@(|.~ i.&EAV)@{. (,. /:~)^:(#@] - 1:)~ > > > > And, here is the wikipedia example: > > > > bwt '^BANANA' > > BNN^AA�A > > ibwt bwt '^BANANA' > > ^BANANA > > > > Note that a more robust implementation would have an index rather than > > EAV to mark end-of-string location (which is necessary for reversal to > > succeed), so the result should be a pair of boxes (containing an index > > and an string). > > > > Building more efficient implementations might interest some of you? > > (I'll probably be trying this, myself, when I have some free time.) > > > > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
