Re: [Jprogramming] zlib huffman coding

Raul Miller Wed, 10 Sep 2014 10:01:49 -0700

I think the use of the term "consecutive" rather than "sequential" is telling.


The described algorithm is: compute the huffman code lengths:
   #@>F1 hcodes A1
1 3 7 7 6 6 6 6 6 6 6 2

Then assign ascending huffman codes first in length order and then
within codes of the same length.

Taken literally, that might be something like this:

H=: 4 :0
  L=.#@> x hcodes y
  U=.~.L
  ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L
)

   ":@>F1 H A1
0
1 1 0
1 1 1 0 0 1 0
1 1 1 0 0 1 1
1 1 1 0 0 0
1 1 1 0 0 1
1 1 1 0 1 0
1 1 1 0 1 1
1 1 1 1 0 0
1 1 1 1 0 1
1 1 1 1 1 0
1 0

But is this correct? Is it actually safe to leave the results like
this - with all codes of the same length being consecutive to each
other?

   F (hcodes -:&:(#@>) H) A
0

No.

So... "consecutive" must refer only to the values used and not their
order within the result.

Perhaps something like this:

deflatecodes=:4 :0
  L=.#@> x hcodes y
  U=.~.L
  R=. ;<@(({.{.U e.~i.&.<:@{.)<@:+"1-@{.{."1 #:@i.@#)/.~L
  R/:;(</. i.@#)L
)

   F (hcodes -:&:(#@>) deflatecodes)  A
1

There should be a better way of doing this, but this should at least
get you started.

Thanks,

-- 
Raul


On Wed, Sep 10, 2014 at 10:45 AM, bill lam <[email protected]> wrote:
> For huffman coding used in zlib:
> https://www.ietf.org/rfc/rfc1951.txt section 3.2.2.
>
>  The Huffman codes used for each alphabet in the "deflate"
>  format have two additional rules:
>
>   * All codes of a given bit length have lexicographically
>   consecutive values, in the same order as the symbols
>   they represent;
>
>   * Shorter codes lexicographically precede longer codes.
> I tried jwiki hcodes in
> I try Roger's essay
> http://www.jsoftware.com/jwiki/Essays/Huffman%20Coding
>
> hc=: 4 : 0
> if. 1=#x do. y
> else. ((i{x),+/j{x) hc (i{y),<j{y [ i=. (i.#x) -. j=. 2{./:x end.
> )
>
> hcodes=: 4 : 0
> assert. x -:&$ y           NB. weights and words have same shape
> assert. (0<:x) *. 1=#$x    NB. weights are non-negative
> assert. 1 >: L.y           NB. words are boxed not more than once
> w=. ,&.> y                 NB. standardized words
> assert. w -: ~.w           NB. words are unique
> t=. 0 {:: x hc w           NB. minimal weight binary tree
> ((< S: 0 t) i. w) { <@(1&=)@; S: 1 {:: t
> )
>
> but the coding produced is malformed for zlib. eg,
> this is what I ran into trouble
>
> f1=: 1 256 17 1 1 9 1
> f2=: 2 1 0 1 255 0 1536
> F=: ,/(f1#f2)
> A=: i.286
>
> F hcodes A
>
> Or a shorter example
>
> A1=: i.12
> F1=: 2 1 0 0 0 0 0 0 0 0 0 1
>
> F1 hcodes A1
>
> Any idea?
>
> --
> regards,
> ====================================================
> GPG key 1024D/4434BAB3 2008-08-24
> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] zlib huffman coding

Reply via email to