I think that the simpler version would be easier to optimize, since it
does not depend on sequence properties.

Also, the important qualities are 1) reversibility and 2) compressibility.

-- 
Raul

On Sat, Feb 2, 2013 at 10:43 PM, Henry Rich <[email protected]> wrote:
> I don't know jack about the BWT, but from the Wikipedia entry this looks
> like an implementation:
>
>    ({~ <:@/:) 'aBANANAz'
> BNNaAAzA
>
> That works for their testcase, but it doesn't match yours.  I think the
> difference is that you sort on all characters rather than just the first.
> Is this simpler version good enough?
>
> Henry Rich
>
>
> On 2/2/2013 10:08 PM, Raul Miller wrote:
>>
>> Here's a slightly flawed implementation of Burrows-Wheeler Transform
>> (which is useful when compressing text, because the result tends to be
>> easy to compress):
>>
>>     bwt0=: (_1 {"1 [: /:~ (|."0 1~ i.@#)) :. ibwt0
>>     bwt0 'this is a test'
>> ssa tt hiie ts
>>     ibwt0=: (_1 { ((,. /:~)^:(#@[) (#,0:) $ ''"_)) :. bwt0
>>     ibwt0 bwt0 'this is a test'
>> stthis is a te
>>
>> As you can see, the inverse does not quite work right -- it has been
>> rotated.
>>
>> Here's a fixed implementation:
>>
>>     bwt=: bwt0@,&EAV :.ibwt
>>     ibwt=: }.@(|.~ i.&EAV)@ibwt0 :.bwt
>>     ibwt bwt 'this is a test'
>> this is a test
>>
>> This fixed implementation might fail if the original data contained EAV.
>>
>> It seems like we ought to be able to implement the transform without
>> this flaw (perhaps using a boxed representation).
>>
>> Allowing arbitrary treatments of how the transformed result tracks the
>> original location of the beginning/end of text, do we have a more
>> efficient way of computing the inverse (or the transform)?
>>
>> Thanks,
>>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to