This is strange since every author must had decode its own encoded data as a smoke test.
Did you test if huffman code or bit lengths it produced was correct or not, ie it is a prefix coding and it satisfy the 2 rules in rfc. Чт, 11 сен 2014, Joe Bogner написал(а): > unfortunately the dynamic coding in the putty fork doesn't seem to work: > > deflate -c deflate.c > out > deflate -d out > > decoding error: incorrect data checksum > > > it works fine with static tables > > C:\temp>echo ABCD > ABCD > > C:\temp>deflate -c ABCD > out > > C:\temp>deflate -d out > ABCD > > I added some debugging code to determine that deflating deflate.c > would be a dynamic table... Assuming it's broke, I probably wouldn't > use it as a reference implementation after all > > On Thu, Sep 11, 2014 at 3:45 AM, bill lam <[email protected]> wrote: > > the frequencies (guessing from bit lengths) should be something like 2 3 1 1 > > (2 3 1 1) hcodes 'ABCD' > > > > the hard part is the inverse problem: how to get the huffman code with > > prior knowing the bits for each symbol. Your pointer to the putty > > fork looks like helpful. The comment is in lines 861 to 914, the code > > itself in line 915 to 964. Do you know how to express it in J? > > Thanks. > > > > On Thu, Sep 11, 2014 at 2:57 PM, Joe Bogner <[email protected]> wrote: > >> Here a few other links ... after reading through the RFC. Not sure if > >> they help, but just sharing from my own research into assisting on > >> this topic > >> > >> https://github.com/evegard/pngview/blob/master/huffman.c#L54 > >> > >> And a fork of the putty version with dynamic huffman coding: > >> http://rc.quest.com/viewvc/putty/trunk/halibut/deflate.c?diff_format=s&revision=2&view=markup > >> > >> Or just generally googling some of the code from the RFC: > >> https://www.google.com/search?q=next_code%5Blen%5D%2B%2B%3B&oq=next_code%5Blen%5D%2B%2B%3B&aqs=chrome..69i57.387j0j7&sourceid=chrome&es_sm=93&ie=UTF-8#q=next_code%5Blen%5D%2B%2B%3B&start=20 > >> > >> > >> Using the code from > >> http://www.jsoftware.com/jwiki/Essays/Huffman%20Coding, I got stuck > >> trying to match a simple example to the binary tree in the RFC: > >> > >> From the RFC: > >> > >> /\ Symbol Code > >> 0 1 ------ ---- > >> / \ A 00 > >> /\ B B 1 > >> 0 1 C 011 > >> / \ D 010 > >> A /\ > >> 0 1 > >> / \ > >> D C > >> > >> > >> > >> (4#1) hcodes 'ABCD' > >> ┌───┬───┬───┬───┐ > >> │0 0│0 1│1 0│1 1│ > >> └───┴───┴───┴───┘ > >> > >> Per the RFC, ideally that should match this? '00';'1';'011';'010' > >> > >> > >> From there, it seems like a pretty straightforward exercise to > >> transliterate the C code from the RFC into J code to recode the > >> example to: > >> > >> > >> Symbol Code > >> ------ ---- > >> A 10 > >> B 0 > >> C 110 > >> D 111 > >> > >> > >> I would probably start with a looping construct like what's in the RFC > >> and then figure out a more J way to do it, but first I would need to > >> figure out how to create the binary tree in that initial format. > >> > >> On Wed, Sep 10, 2014 at 7:41 PM, bill lam <[email protected]> wrote: > >>> Thanks Joe, > >>> putty only use zlib static huffman for encoding so that it does not build > >>> any huffman dictionary table. > >>> > >>> The zlib static huffman code does not care about individual symbol's > >>> frequency, it just encode 0 to 286 into bits, see section 3.2.6. > >>> On Sep 11, 2014 1:26 AM, "Joe Bogner" <[email protected]> wrote: > >>> > >>>> You've already likely considered this, but if it were me I would compare > >>>> results to a working implementation. The one from putty seems pretty > >>>> clean > >>>> and standalone: > >>>> https://raw.githubusercontent.com/grumpydev/PortablePuTTY/master/SSHZLIB.C > >>>> . I was able to compile it on windows no problem and I assume it'd be > >>>> fine > >>>> on linux as well. > >>>> > >>>> On Wed, Sep 10, 2014 at 1:00 PM, Raul Miller <[email protected]> > >>>> wrote: > >>>> > >>>> > I think the use of the term "consecutive" rather than "sequential" is > >>>> > telling. > >>>> > > >>>> > The described algorithm is: compute the huffman code lengths: > >>>> > #@>F1 hcodes A1 > >>>> > 1 3 7 7 6 6 6 6 6 6 6 2 > >>>> > > >>>> > Then assign ascending huffman codes first in length order and then > >>>> > within codes of the same length. > >>>> > > >>>> > Taken literally, that might be something like this: > >>>> > > >>>> > H=: 4 :0 > >>>> > L=.#@> x hcodes y > >>>> > U=.~.L > >>>> > ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L > >>>> > ) > >>>> > > >>>> > ":@>F1 H A1 > >>>> > 0 > >>>> > 1 1 0 > >>>> > 1 1 1 0 0 1 0 > >>>> > 1 1 1 0 0 1 1 > >>>> > 1 1 1 0 0 0 > >>>> > 1 1 1 0 0 1 > >>>> > 1 1 1 0 1 0 > >>>> > 1 1 1 0 1 1 > >>>> > 1 1 1 1 0 0 > >>>> > 1 1 1 1 0 1 > >>>> > 1 1 1 1 1 0 > >>>> > 1 0 > >>>> > > >>>> > But is this correct? Is it actually safe to leave the results like > >>>> > this - with all codes of the same length being consecutive to each > >>>> > other? > >>>> > > >>>> > F (hcodes -:&:(#@>) H) A > >>>> > 0 > >>>> > > >>>> > No. > >>>> > > >>>> > So... "consecutive" must refer only to the values used and not their > >>>> > order within the result. > >>>> > > >>>> > Perhaps something like this: > >>>> > > >>>> > deflatecodes=:4 :0 > >>>> > L=.#@> x hcodes y > >>>> > U=.~.L > >>>> > R=. ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L > >>>> > R/:;(</. i.@#)L > >>>> > ) > >>>> > > >>>> > F (hcodes -:&:(#@>) deflatecodes) A > >>>> > 1 > >>>> > > >>>> > There should be a better way of doing this, but this should at least > >>>> > get you started. > >>>> > > >>>> > Thanks, > >>>> > > >>>> > -- > >>>> > Raul > >>>> > > >>>> > > >>>> > On Wed, Sep 10, 2014 at 10:45 AM, bill lam <[email protected]> wrote: > >>>> > > For huffman coding used in zlib: > >>>> > > https://www.ietf.org/rfc/rfc1951.txt section 3.2.2. > >>>> > > > >>>> > > The Huffman codes used for each alphabet in the "deflate" > >>>> > > format have two additional rules: > >>>> > > > >>>> > > * All codes of a given bit length have lexicographically > >>>> > > consecutive values, in the same order as the symbols > >>>> > > they represent; > >>>> > > > >>>> > > * Shorter codes lexicographically precede longer codes. > >>>> > > I tried jwiki hcodes in > >>>> > > I try Roger's essay > >>>> > > http://www.jsoftware.com/jwiki/Essays/Huffman%20Coding > >>>> > > > >>>> > > hc=: 4 : 0 > >>>> > > if. 1=#x do. y > >>>> > > else. ((i{x),+/j{x) hc (i{y),<j{y [ i=. (i.#x) -. j=. 2{./:x end. > >>>> > > ) > >>>> > > > >>>> > > hcodes=: 4 : 0 > >>>> > > assert. x -:&$ y NB. weights and words have same shape > >>>> > > assert. (0<:x) *. 1=#$x NB. weights are non-negative > >>>> > > assert. 1 >: L.y NB. words are boxed not more than once > >>>> > > w=. ,&.> y NB. standardized words > >>>> > > assert. w -: ~.w NB. words are unique > >>>> > > t=. 0 {:: x hc w NB. minimal weight binary tree > >>>> > > ((< S: 0 t) i. w) { <@(1&=)@; S: 1 {:: t > >>>> > > ) > >>>> > > > >>>> > > but the coding produced is malformed for zlib. eg, > >>>> > > this is what I ran into trouble > >>>> > > > >>>> > > f1=: 1 256 17 1 1 9 1 > >>>> > > f2=: 2 1 0 1 255 0 1536 > >>>> > > F=: ,/(f1#f2) > >>>> > > A=: i.286 > >>>> > > > >>>> > > F hcodes A > >>>> > > > >>>> > > Or a shorter example > >>>> > > > >>>> > > A1=: i.12 > >>>> > > F1=: 2 1 0 0 0 0 0 0 0 0 0 1 > >>>> > > > >>>> > > F1 hcodes A1 > >>>> > > > >>>> > > Any idea? > >>>> > > > >>>> > > -- > >>>> > > regards, > >>>> > > ==================================================== > >>>> > > GPG key 1024D/4434BAB3 2008-08-24 > >>>> > > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 > >>>> > > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 > >>>> > > ---------------------------------------------------------------------- > >>>> > > For information about J forums see > >>>> > > http://www.jsoftware.com/forums.htm > >>>> > ---------------------------------------------------------------------- > >>>> > For information about J forums see http://www.jsoftware.com/forums.htm > >>>> > > >>>> ---------------------------------------------------------------------- > >>>> For information about J forums see http://www.jsoftware.com/forums.htm > >>>> > >>> ---------------------------------------------------------------------- > >>> For information about J forums see http://www.jsoftware.com/forums.htm > >> ---------------------------------------------------------------------- > >> For information about J forums see http://www.jsoftware.com/forums.htm > > ---------------------------------------------------------------------- > > For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm -- regards, ==================================================== GPG key 1024D/4434BAB3 2008-08-24 gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
