sorry typo, because there are _7_ symbols in the 6 bit code. On Sep 11, 2014 7:26 AM, "bill lam" <[email protected]> wrote:
> for zlib, all huffman code of the same bit length are in sequential, > starting with a code an 0 bit in the end (even number), and no gaps in the > block (consecutive). the symbols of these block represented are sequential > but not consecutive (possibly gaps). So the zlib huffman code is slightly > less efficient than the original huffman code but the advantage is simpler > to store the table, just the bits used. section 3.2.6 gives an example of > such table. > > deflatecodes can satisfy the 2 rules _but_ its result is invalid because > huffman code is a prefix coding. > > ,.F1 ( deflatecodes) A1 > ┌─────────────┐ > │0 │ > ├─────────────┤ > │1 1 0 │ > ├─────────────┤ > │1 1 1 0 0 1 0│ > ├─────────────┤ > │1 1 1 0 0 1 1│ > ├─────────────┤ > │1 1 1 0 0 0 │ > ├─────────────┤ > │1 1 1 0 0 1 │ > ├─────────────┤ > │1 1 1 0 1 0 │ > ├─────────────┤ > │1 1 1 0 1 1 │ > ├─────────────┤ > │1 1 1 1 0 0 │ > ├─────────────┤ > │1 1 1 1 0 1 │ > ├─────────────┤ > │1 1 1 1 1 0 │ > ├─────────────┤ > │1 0 │ > └─────────────┘ > 7 bit 1110010 is a prefix of 6 bit 111001 and this is illegal. > Instead 7 bit code block should start with > #.inv 2b11100 + 7 > 1 1 1 1 1 1 0 > because there are 6 symbols in the 6 bit code. (algorithm given in > section 3.2.2). > > Any idea to fix deflatecodes so that it can give valid huffman codes? > Thanks. > On Sep 11, 2014 1:01 AM, "Raul Miller" <[email protected]> wrote: > >> I think the use of the term "consecutive" rather than "sequential" is >> telling. >> >> The described algorithm is: compute the huffman code lengths: >> #@>F1 hcodes A1 >> 1 3 7 7 6 6 6 6 6 6 6 2 >> >> Then assign ascending huffman codes first in length order and then >> within codes of the same length. >> >> Taken literally, that might be something like this: >> >> H=: 4 :0 >> L=.#@> x hcodes y >> U=.~.L >> ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L >> ) >> >> ":@>F1 H A1 >> 0 >> 1 1 0 >> 1 1 1 0 0 1 0 >> 1 1 1 0 0 1 1 >> 1 1 1 0 0 0 >> 1 1 1 0 0 1 >> 1 1 1 0 1 0 >> 1 1 1 0 1 1 >> 1 1 1 1 0 0 >> 1 1 1 1 0 1 >> 1 1 1 1 1 0 >> 1 0 >> >> But is this correct? Is it actually safe to leave the results like >> this - with all codes of the same length being consecutive to each >> other? >> >> F (hcodes -:&:(#@>) H) A >> 0 >> >> No. >> >> So... "consecutive" must refer only to the values used and not their >> order within the result. >> >> Perhaps something like this: >> >> deflatecodes=:4 :0 >> L=.#@> x hcodes y >> U=.~.L >> R=. ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L >> R/:;(</. i.@#)L >> ) >> >> F (hcodes -:&:(#@>) deflatecodes) A >> 1 >> >> There should be a better way of doing this, but this should at least >> get you started. >> >> Thanks, >> >> -- >> Raul >> >> >> On Wed, Sep 10, 2014 at 10:45 AM, bill lam <[email protected]> wrote: >> > For huffman coding used in zlib: >> > https://www.ietf.org/rfc/rfc1951.txt section 3.2.2. >> > >> > The Huffman codes used for each alphabet in the "deflate" >> > format have two additional rules: >> > >> > * All codes of a given bit length have lexicographically >> > consecutive values, in the same order as the symbols >> > they represent; >> > >> > * Shorter codes lexicographically precede longer codes. >> > I tried jwiki hcodes in >> > I try Roger's essay >> > http://www.jsoftware.com/jwiki/Essays/Huffman%20Coding >> > >> > hc=: 4 : 0 >> > if. 1=#x do. y >> > else. ((i{x),+/j{x) hc (i{y),<j{y [ i=. (i.#x) -. j=. 2{./:x end. >> > ) >> > >> > hcodes=: 4 : 0 >> > assert. x -:&$ y NB. weights and words have same shape >> > assert. (0<:x) *. 1=#$x NB. weights are non-negative >> > assert. 1 >: L.y NB. words are boxed not more than once >> > w=. ,&.> y NB. standardized words >> > assert. w -: ~.w NB. words are unique >> > t=. 0 {:: x hc w NB. minimal weight binary tree >> > ((< S: 0 t) i. w) { <@(1&=)@; S: 1 {:: t >> > ) >> > >> > but the coding produced is malformed for zlib. eg, >> > this is what I ran into trouble >> > >> > f1=: 1 256 17 1 1 9 1 >> > f2=: 2 1 0 1 255 0 1536 >> > F=: ,/(f1#f2) >> > A=: i.286 >> > >> > F hcodes A >> > >> > Or a shorter example >> > >> > A1=: i.12 >> > F1=: 2 1 0 0 0 0 0 0 0 0 0 1 >> > >> > F1 hcodes A1 >> > >> > Any idea? >> > >> > -- >> > regards, >> > ==================================================== >> > GPG key 1024D/4434BAB3 2008-08-24 >> > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 >> > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 >> > ---------------------------------------------------------------------- >> > For information about J forums see http://www.jsoftware.com/forums.htm >> ---------------------------------------------------------------------- >> For information about J forums see http://www.jsoftware.com/forums.htm >> > ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
