Here a few other links ... after reading through the RFC. Not sure if they help, but just sharing from my own research into assisting on this topic
https://github.com/evegard/pngview/blob/master/huffman.c#L54 And a fork of the putty version with dynamic huffman coding: http://rc.quest.com/viewvc/putty/trunk/halibut/deflate.c?diff_format=s&revision=2&view=markup Or just generally googling some of the code from the RFC: https://www.google.com/search?q=next_code%5Blen%5D%2B%2B%3B&oq=next_code%5Blen%5D%2B%2B%3B&aqs=chrome..69i57.387j0j7&sourceid=chrome&es_sm=93&ie=UTF-8#q=next_code%5Blen%5D%2B%2B%3B&start=20 Using the code from http://www.jsoftware.com/jwiki/Essays/Huffman%20Coding, I got stuck trying to match a simple example to the binary tree in the RFC: From the RFC: /\ Symbol Code 0 1 ------ ---- / \ A 00 /\ B B 1 0 1 C 011 / \ D 010 A /\ 0 1 / \ D C (4#1) hcodes 'ABCD' ┌───┬───┬───┬───┐ │0 0│0 1│1 0│1 1│ └───┴───┴───┴───┘ Per the RFC, ideally that should match this? '00';'1';'011';'010' From there, it seems like a pretty straightforward exercise to transliterate the C code from the RFC into J code to recode the example to: Symbol Code ------ ---- A 10 B 0 C 110 D 111 I would probably start with a looping construct like what's in the RFC and then figure out a more J way to do it, but first I would need to figure out how to create the binary tree in that initial format. On Wed, Sep 10, 2014 at 7:41 PM, bill lam <[email protected]> wrote: > Thanks Joe, > putty only use zlib static huffman for encoding so that it does not build > any huffman dictionary table. > > The zlib static huffman code does not care about individual symbol's > frequency, it just encode 0 to 286 into bits, see section 3.2.6. > On Sep 11, 2014 1:26 AM, "Joe Bogner" <[email protected]> wrote: > >> You've already likely considered this, but if it were me I would compare >> results to a working implementation. The one from putty seems pretty clean >> and standalone: >> https://raw.githubusercontent.com/grumpydev/PortablePuTTY/master/SSHZLIB.C >> . I was able to compile it on windows no problem and I assume it'd be fine >> on linux as well. >> >> On Wed, Sep 10, 2014 at 1:00 PM, Raul Miller <[email protected]> >> wrote: >> >> > I think the use of the term "consecutive" rather than "sequential" is >> > telling. >> > >> > The described algorithm is: compute the huffman code lengths: >> > #@>F1 hcodes A1 >> > 1 3 7 7 6 6 6 6 6 6 6 2 >> > >> > Then assign ascending huffman codes first in length order and then >> > within codes of the same length. >> > >> > Taken literally, that might be something like this: >> > >> > H=: 4 :0 >> > L=.#@> x hcodes y >> > U=.~.L >> > ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L >> > ) >> > >> > ":@>F1 H A1 >> > 0 >> > 1 1 0 >> > 1 1 1 0 0 1 0 >> > 1 1 1 0 0 1 1 >> > 1 1 1 0 0 0 >> > 1 1 1 0 0 1 >> > 1 1 1 0 1 0 >> > 1 1 1 0 1 1 >> > 1 1 1 1 0 0 >> > 1 1 1 1 0 1 >> > 1 1 1 1 1 0 >> > 1 0 >> > >> > But is this correct? Is it actually safe to leave the results like >> > this - with all codes of the same length being consecutive to each >> > other? >> > >> > F (hcodes -:&:(#@>) H) A >> > 0 >> > >> > No. >> > >> > So... "consecutive" must refer only to the values used and not their >> > order within the result. >> > >> > Perhaps something like this: >> > >> > deflatecodes=:4 :0 >> > L=.#@> x hcodes y >> > U=.~.L >> > R=. ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L >> > R/:;(</. i.@#)L >> > ) >> > >> > F (hcodes -:&:(#@>) deflatecodes) A >> > 1 >> > >> > There should be a better way of doing this, but this should at least >> > get you started. >> > >> > Thanks, >> > >> > -- >> > Raul >> > >> > >> > On Wed, Sep 10, 2014 at 10:45 AM, bill lam <[email protected]> wrote: >> > > For huffman coding used in zlib: >> > > https://www.ietf.org/rfc/rfc1951.txt section 3.2.2. >> > > >> > > The Huffman codes used for each alphabet in the "deflate" >> > > format have two additional rules: >> > > >> > > * All codes of a given bit length have lexicographically >> > > consecutive values, in the same order as the symbols >> > > they represent; >> > > >> > > * Shorter codes lexicographically precede longer codes. >> > > I tried jwiki hcodes in >> > > I try Roger's essay >> > > http://www.jsoftware.com/jwiki/Essays/Huffman%20Coding >> > > >> > > hc=: 4 : 0 >> > > if. 1=#x do. y >> > > else. ((i{x),+/j{x) hc (i{y),<j{y [ i=. (i.#x) -. j=. 2{./:x end. >> > > ) >> > > >> > > hcodes=: 4 : 0 >> > > assert. x -:&$ y NB. weights and words have same shape >> > > assert. (0<:x) *. 1=#$x NB. weights are non-negative >> > > assert. 1 >: L.y NB. words are boxed not more than once >> > > w=. ,&.> y NB. standardized words >> > > assert. w -: ~.w NB. words are unique >> > > t=. 0 {:: x hc w NB. minimal weight binary tree >> > > ((< S: 0 t) i. w) { <@(1&=)@; S: 1 {:: t >> > > ) >> > > >> > > but the coding produced is malformed for zlib. eg, >> > > this is what I ran into trouble >> > > >> > > f1=: 1 256 17 1 1 9 1 >> > > f2=: 2 1 0 1 255 0 1536 >> > > F=: ,/(f1#f2) >> > > A=: i.286 >> > > >> > > F hcodes A >> > > >> > > Or a shorter example >> > > >> > > A1=: i.12 >> > > F1=: 2 1 0 0 0 0 0 0 0 0 0 1 >> > > >> > > F1 hcodes A1 >> > > >> > > Any idea? >> > > >> > > -- >> > > regards, >> > > ==================================================== >> > > GPG key 1024D/4434BAB3 2008-08-24 >> > > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 >> > > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 >> > > ---------------------------------------------------------------------- >> > > For information about J forums see http://www.jsoftware.com/forums.htm >> > ---------------------------------------------------------------------- >> > For information about J forums see http://www.jsoftware.com/forums.htm >> > >> ---------------------------------------------------------------------- >> For information about J forums see http://www.jsoftware.com/forums.htm >> > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
