Oops, sorry about that.
Try it this way:
deflatecodes2=:4 :0
L=. #@> x hcodes y
U=. 0,~.L
R=. ;<@(({. >./@(>#])&U #1:)@{. <@:+"1-@{.{."1 #:@i.@#)/.~L
R/:;(</. i.@#)L
)
":@>F1 deflatecodes2 A1
0
1 1 0
1 1 1 1 1 1 0
1 1 1 1 1 1 1
1 1 1 0 0 0
1 1 1 0 0 1
1 1 1 0 1 0
1 1 1 0 1 1
1 1 1 1 0 0
1 1 1 1 0 1
1 1 1 1 1 0
1 0
Thanks,
--
Raul
On Wed, Sep 10, 2014 at 7:26 PM, bill lam <[email protected]> wrote:
> for zlib, all huffman code of the same bit length are in sequential,
> starting with a code an 0 bit in the end (even number), and no gaps in the
> block (consecutive). the symbols of these block represented are sequential
> but not consecutive (possibly gaps). So the zlib huffman code is slightly
> less efficient than the original huffman code but the advantage is simpler
> to store the table, just the bits used. section 3.2.6 gives an example of
> such table.
>
> deflatecodes can satisfy the 2 rules _but_ its result is invalid because
> huffman code is a prefix coding.
>
> ,.F1 ( deflatecodes) A1
> ┌─────────────┐
> │0 │
> ├─────────────┤
> │1 1 0 │
> ├─────────────┤
> │1 1 1 0 0 1 0│
> ├─────────────┤
> │1 1 1 0 0 1 1│
> ├─────────────┤
> │1 1 1 0 0 0 │
> ├─────────────┤
> │1 1 1 0 0 1 │
> ├─────────────┤
> │1 1 1 0 1 0 │
> ├─────────────┤
> │1 1 1 0 1 1 │
> ├─────────────┤
> │1 1 1 1 0 0 │
> ├─────────────┤
> │1 1 1 1 0 1 │
> ├─────────────┤
> │1 1 1 1 1 0 │
> ├─────────────┤
> │1 0 │
> └─────────────┘
> 7 bit 1110010 is a prefix of 6 bit 111001 and this is illegal.
> Instead 7 bit code block should start with
> #.inv 2b11100 + 7
> 1 1 1 1 1 1 0
> because there are 6 symbols in the 6 bit code. (algorithm given in section
> 3.2.2).
>
> Any idea to fix deflatecodes so that it can give valid huffman codes?
> Thanks.
> On Sep 11, 2014 1:01 AM, "Raul Miller" <[email protected]> wrote:
>
>> I think the use of the term "consecutive" rather than "sequential" is
>> telling.
>>
>> The described algorithm is: compute the huffman code lengths:
>> #@>F1 hcodes A1
>> 1 3 7 7 6 6 6 6 6 6 6 2
>>
>> Then assign ascending huffman codes first in length order and then
>> within codes of the same length.
>>
>> Taken literally, that might be something like this:
>>
>> H=: 4 :0
>> L=.#@> x hcodes y
>> U=.~.L
>> ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L
>> )
>>
>> ":@>F1 H A1
>> 0
>> 1 1 0
>> 1 1 1 0 0 1 0
>> 1 1 1 0 0 1 1
>> 1 1 1 0 0 0
>> 1 1 1 0 0 1
>> 1 1 1 0 1 0
>> 1 1 1 0 1 1
>> 1 1 1 1 0 0
>> 1 1 1 1 0 1
>> 1 1 1 1 1 0
>> 1 0
>>
>> But is this correct? Is it actually safe to leave the results like
>> this - with all codes of the same length being consecutive to each
>> other?
>>
>> F (hcodes -:&:(#@>) H) A
>> 0
>>
>> No.
>>
>> So... "consecutive" must refer only to the values used and not their
>> order within the result.
>>
>> Perhaps something like this:
>>
>> deflatecodes=:4 :0
>> L=.#@> x hcodes y
>> U=.~.L
>> R=. ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L
>> R/:;(</. i.@#)L
>> )
>>
>> F (hcodes -:&:(#@>) deflatecodes) A
>> 1
>>
>> There should be a better way of doing this, but this should at least
>> get you started.
>>
>> Thanks,
>>
>> --
>> Raul
>>
>>
>> On Wed, Sep 10, 2014 at 10:45 AM, bill lam <[email protected]> wrote:
>> > For huffman coding used in zlib:
>> > https://www.ietf.org/rfc/rfc1951.txt section 3.2.2.
>> >
>> > The Huffman codes used for each alphabet in the "deflate"
>> > format have two additional rules:
>> >
>> > * All codes of a given bit length have lexicographically
>> > consecutive values, in the same order as the symbols
>> > they represent;
>> >
>> > * Shorter codes lexicographically precede longer codes.
>> > I tried jwiki hcodes in
>> > I try Roger's essay
>> > http://www.jsoftware.com/jwiki/Essays/Huffman%20Coding
>> >
>> > hc=: 4 : 0
>> > if. 1=#x do. y
>> > else. ((i{x),+/j{x) hc (i{y),<j{y [ i=. (i.#x) -. j=. 2{./:x end.
>> > )
>> >
>> > hcodes=: 4 : 0
>> > assert. x -:&$ y NB. weights and words have same shape
>> > assert. (0<:x) *. 1=#$x NB. weights are non-negative
>> > assert. 1 >: L.y NB. words are boxed not more than once
>> > w=. ,&.> y NB. standardized words
>> > assert. w -: ~.w NB. words are unique
>> > t=. 0 {:: x hc w NB. minimal weight binary tree
>> > ((< S: 0 t) i. w) { <@(1&=)@; S: 1 {:: t
>> > )
>> >
>> > but the coding produced is malformed for zlib. eg,
>> > this is what I ran into trouble
>> >
>> > f1=: 1 256 17 1 1 9 1
>> > f2=: 2 1 0 1 255 0 1536
>> > F=: ,/(f1#f2)
>> > A=: i.286
>> >
>> > F hcodes A
>> >
>> > Or a shorter example
>> >
>> > A1=: i.12
>> > F1=: 2 1 0 0 0 0 0 0 0 0 0 1
>> >
>> > F1 hcodes A1
>> >
>> > Any idea?
>> >
>> > --
>> > regards,
>> > ====================================================
>> > GPG key 1024D/4434BAB3 2008-08-24
>> > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
>> > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
>> > ----------------------------------------------------------------------
>> > For information about J forums see http://www.jsoftware.com/forums.htm
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm