You've already likely considered this, but if it were me I would compare results to a working implementation. The one from putty seems pretty clean and standalone: https://raw.githubusercontent.com/grumpydev/PortablePuTTY/master/SSHZLIB.C . I was able to compile it on windows no problem and I assume it'd be fine on linux as well.
On Wed, Sep 10, 2014 at 1:00 PM, Raul Miller <[email protected]> wrote: > I think the use of the term "consecutive" rather than "sequential" is > telling. > > The described algorithm is: compute the huffman code lengths: > #@>F1 hcodes A1 > 1 3 7 7 6 6 6 6 6 6 6 2 > > Then assign ascending huffman codes first in length order and then > within codes of the same length. > > Taken literally, that might be something like this: > > H=: 4 :0 > L=.#@> x hcodes y > U=.~.L > ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L > ) > > ":@>F1 H A1 > 0 > 1 1 0 > 1 1 1 0 0 1 0 > 1 1 1 0 0 1 1 > 1 1 1 0 0 0 > 1 1 1 0 0 1 > 1 1 1 0 1 0 > 1 1 1 0 1 1 > 1 1 1 1 0 0 > 1 1 1 1 0 1 > 1 1 1 1 1 0 > 1 0 > > But is this correct? Is it actually safe to leave the results like > this - with all codes of the same length being consecutive to each > other? > > F (hcodes -:&:(#@>) H) A > 0 > > No. > > So... "consecutive" must refer only to the values used and not their > order within the result. > > Perhaps something like this: > > deflatecodes=:4 :0 > L=.#@> x hcodes y > U=.~.L > R=. ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L > R/:;(</. i.@#)L > ) > > F (hcodes -:&:(#@>) deflatecodes) A > 1 > > There should be a better way of doing this, but this should at least > get you started. > > Thanks, > > -- > Raul > > > On Wed, Sep 10, 2014 at 10:45 AM, bill lam <[email protected]> wrote: > > For huffman coding used in zlib: > > https://www.ietf.org/rfc/rfc1951.txt section 3.2.2. > > > > The Huffman codes used for each alphabet in the "deflate" > > format have two additional rules: > > > > * All codes of a given bit length have lexicographically > > consecutive values, in the same order as the symbols > > they represent; > > > > * Shorter codes lexicographically precede longer codes. > > I tried jwiki hcodes in > > I try Roger's essay > > http://www.jsoftware.com/jwiki/Essays/Huffman%20Coding > > > > hc=: 4 : 0 > > if. 1=#x do. y > > else. ((i{x),+/j{x) hc (i{y),<j{y [ i=. (i.#x) -. j=. 2{./:x end. > > ) > > > > hcodes=: 4 : 0 > > assert. x -:&$ y NB. weights and words have same shape > > assert. (0<:x) *. 1=#$x NB. weights are non-negative > > assert. 1 >: L.y NB. words are boxed not more than once > > w=. ,&.> y NB. standardized words > > assert. w -: ~.w NB. words are unique > > t=. 0 {:: x hc w NB. minimal weight binary tree > > ((< S: 0 t) i. w) { <@(1&=)@; S: 1 {:: t > > ) > > > > but the coding produced is malformed for zlib. eg, > > this is what I ran into trouble > > > > f1=: 1 256 17 1 1 9 1 > > f2=: 2 1 0 1 255 0 1536 > > F=: ,/(f1#f2) > > A=: i.286 > > > > F hcodes A > > > > Or a shorter example > > > > A1=: i.12 > > F1=: 2 1 0 0 0 0 0 0 0 0 0 1 > > > > F1 hcodes A1 > > > > Any idea? > > > > -- > > regards, > > ==================================================== > > GPG key 1024D/4434BAB3 2008-08-24 > > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 > > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 > > ---------------------------------------------------------------------- > > For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
