I don't know jack about the BWT, but from the Wikipedia entry this looks
like an implementation:
({~ <:@/:) 'aBANANAz'
BNNaAAzA
That works for their testcase, but it doesn't match yours. I think the
difference is that you sort on all characters rather than just the
first. Is this simpler version good enough?
Henry Rich
On 2/2/2013 10:08 PM, Raul Miller wrote:
Here's a slightly flawed implementation of Burrows-Wheeler Transform
(which is useful when compressing text, because the result tends to be
easy to compress):
bwt0=: (_1 {"1 [: /:~ (|."0 1~ i.@#)) :. ibwt0
bwt0 'this is a test'
ssa tt hiie ts
ibwt0=: (_1 { ((,. /:~)^:(#@[) (#,0:) $ ''"_)) :. bwt0
ibwt0 bwt0 'this is a test'
stthis is a te
As you can see, the inverse does not quite work right -- it has been rotated.
Here's a fixed implementation:
bwt=: bwt0@,&EAV :.ibwt
ibwt=: }.@(|.~ i.&EAV)@ibwt0 :.bwt
ibwt bwt 'this is a test'
this is a test
This fixed implementation might fail if the original data contained EAV.
It seems like we ought to be able to implement the transform without
this flaw (perhaps using a boxed representation).
Allowing arbitrary treatments of how the transformed result tracks the
original location of the beginning/end of text, do we have a more
efficient way of computing the inverse (or the transform)?
Thanks,
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm