On Sat, Sep 14, 2013 at 09:59:20AM -0400, Justin Ruggles wrote:
> On 09/14/2013 03:46 AM, Kostya Shishkov wrote:
> > On Fri, Sep 13, 2013 at 04:31:00PM -0400, Justin Ruggles wrote:
> >> ---
> >> doc/general.texi | 2 +-
> >> libavcodec/webp.c | 1173
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++-
> >> 2 files changed, 1164 insertions(+), 11 deletions(-)
> >
> > still no Changelog entry?
>
> Will fix.
>
[...]
> >> +
> >> +enum ImageRole {
> >> + /* Stores the actual pixels of the image. */
> >> + IMAGE_ROLE_ARGB,
> >> +
> >> + /* Stores the meta Huffman codes. The red and green components of a
> >> pixel
> >> + define the meta Huffman code used in a particular block of the ARGB
> >> + image. */
> >> + IMAGE_ROLE_ENTROPY,
> >> +
> >> + /* Stores the metadata for Predictor Transform. The green component
> >> of a
> >> + pixel defines which of the 14 predictors is used within a
> >> particular
> >> + block of the ARGB image. */
> >> + IMAGE_ROLE_PREDICTOR,
> >> +
> >> + /* It is created by ColorTransformElement values for different blocks
> >> of
> >> + the image. Each ColorTransformElement is treated as a pixel whose
> >> alpha
> >> + component is 255, red component is cte.red_to_blue, green
> >> component is
> >> + cte.green_to_blue and blue component is cte.green_to_red. */
> >> + IMAGE_ROLE_COLOR_TRANSFORM,
> >> +
> >> + /* Color indexing image: An array of of size color_table_size (up to
> >> 256
> >> + ARGB values) storing the metadata for the Color Indexing
> >> Transform. This
> >> + is stored as an image of width color_table_size and height 1. */
> >> + IMAGE_ROLE_COLOR_INDEXING,
> >> +
> >> + IMAGE_ROLE_NB,
> >> +};
> >
> > WTF, simply WTF
> > Even with comments it's impossible to comprehend what is it for.
> > Was it borrowed from the original specification as is?
>
> Yes, that's from the spec. Basically the information for each of those
> is coded as an "entropy coded image". All are essentially coded the same
> way and are treated as an ARGB image (though not all color components
> are always used for real information). I do admit the terminology from
> the spec is not particularly good... I'll try to rewrite them in a less
> confusing way.
All frame's a stage,
And all images are merely players:
They have their lifecycle ends and starts,
Its act being some roles. At first, the entropy image,
...
etc, etc.
> >> +
> >> +typedef struct HuffReader {
> >> + VLC vlc;
> >> + int simple;
> >
> > I'd rather call it binary, not simple.
>
> That term is from the spec. And yeah, it's either no code (typically
> used for all 0's) or binary.
Okay, keep it then (let people curse the spec).
> >> + int nb_symbols;
> >> + uint16_t simple_symbols[2];
> >> +} HuffReader;
> >> +
> >> +typedef struct HuffmanGroup {
> >> + HuffReader huffman_code[HUFFMAN_CODES_PER_META_CODE];
> >> +} HuffmanGroup;
> >> +
> >> +typedef struct ImageContext {
> >> + enum ImageRole role;
> >> + AVFrame *frame;
> >> + int color_cache_bits;
> >> + uint8_t *color_cache;
> >> + int num_huffman_groups;
> >> + HuffmanGroup *huffman_groups;
> >> +} ImageContext;
> >> +
> >> +typedef union {
> >> + uint8_t u8;
> >> + int8_t s8;
> >> +} av_alias alias8;
> >
> > looks unneeded - we have such thing in lavu for sure
>
> Will someone please point this out? I obviously looked there first
> before putting it here and didn't find anything.
You just need some functions like int8_t u8tos8(uint8_t val) in lavc/mathops.h
(patchisrequired) and in the most cases you don't need such type at all.
> >> typedef struct WebPContext {
> >> VP8Context v;
> >> + AVCodecContext *avctx;
> >> int initialized;
> >> + GetBitContext gb;
> >> GetByteContext alpha_gb;
> >> uint8_t has_alpha;
> >> int width;
> >> int height;
> >> +
> >> + int lossless;
> >> + ImageContext image[IMAGE_ROLE_NB];
> >> + enum TransformType transforms[4];
> >> + int nb_transforms;
> >> + int subtract_green;
> >> + int predictor_block_size;
> >> + int color_block_size;
> >> + int color_index_width_bits;
> >> + int reduced_width;
> >> + int use_entropy_image;
> >> + int entropy_size;
> >> } WebPContext;
> >>
> >> +static av_always_inline uint8_t *get_pixel(AVFrame *frame, int x, int y)
> >> +{
> >> + if (x < 0 || y < 0)
> >> + return NULL;
> >> + if (x >= frame->width) {
> >> + x = 0;
> >> + y++;
> >> + }
> >> + if (y >= frame->height)
> >> + return NULL;
> >> +
> >> + return frame->data[0] + y * frame->linesize[0] + 4 * x;
> >> +}
> >
> > looks unneeded
>
> The part that is useful is selecting the first pixel on the next row
> when getting the top-right pixel from a pixel on the far right column.
> Plus it simplifies the code. The checks are probably unnecessary though.
without the checks it boils down to the last line
> >> +static av_always_inline uint8_t add_comp(int a, int b)
> >> +{
> >> + return (a + b) & 0xFF;
> >> +}
> >
> > looks unneeded (especially masking) and you seem to use it on
> > uint8_t->uint8_t
> > anyway
>
> Ok, I'll go through and make sure everything still works without it.
>
> >> +
> >> +static av_always_inline uint8_t avg2_comp(int a, int b)
> >> +{
> >> + return (a + b) / 2;
> >> +}
> >> +
> >> +static av_always_inline uint8_t clamp_add_subtract_full(int a, int b, int
> >> c)
> >> +{
> >> + return av_clip_uint8(a + b - c);
> >> +}
> >> +
> >> +static av_always_inline uint8_t clamp_add_subtract_half(int a, int b, int
> >> c)
> >> +{
> >> + int d = avg2_comp(a, b);
> >> + return av_clip_uint8(d + (d - c) / 2);
> >> +}
> >
> > nit: this one looks like it can be simplified a bit
>
> In what way?
(3 * a + 3 * b - 2 * c) >> 2
just saying
[...]
> >> + }
> >> + memset(img, 0, sizeof(*img));
> >> +}
> >> +
> >> +
> >> +/* Differs from get_vlc2() in the following ways:
> >> + * - codes are bit-reversed
> >> + * - assumes 8-bit table to make reversal simpler
> >> + * - assumes max depth of 2 since the max code length for WebP is 15
> >> + */
> >> +static av_always_inline int webp_get_vlc(GetBitContext *gb, VLC_TYPE
> >> (*table)[2])
> >> +{
> >> + int n, nb_bits;
> >> + unsigned int index;
> >> + int code;
> >> +
> >> + OPEN_READER(re, gb);
> >> + UPDATE_CACHE(re, gb);
> >> +
> >> + index = SHOW_UBITS(re, gb, 8);
> >> + index = ff_reverse[index];
> >> + code = table[index][0];
> >> + n = table[index][1];
> >> +
> >> + if (n < 0) {
> >> + LAST_SKIP_BITS(re, gb, 8);
> >> + UPDATE_CACHE(re, gb);
> >> +
> >> + nb_bits = -n;
> >> +
> >> + index = SHOW_UBITS(re, gb, nb_bits);
> >> + index = (ff_reverse[index] >> (8 - nb_bits)) + code;
> >> + code = table[index][0];
> >> + n = table[index][1];
> >> + }
> >> + SKIP_BITS(re, gb, n);
> >> +
> >> + CLOSE_READER(re, gb);
> >> +
> >> + return code;
> >> +}
> >
> > is it possible just to swap/reverse the whole bitstream or there are reasons
> > preventing this (e.g. mix of bit fields in normal order and those VLCs)?
>
> That is exactly the case. It is a mix.
Fine, but I like the format design less and less...
[...]
> >> +#define PARSE_BLOCK_SIZE(w, h) do {
> >> \
> >> + block_bits = get_bits(&s->gb, 3) + 2;
> >> \
> >> + block_size = 1 << block_bits;
> >> \
> >> + blocks_w = FFALIGN((w), block_size) / block_size;
> >> \
> >> + blocks_h = FFALIGN((h), block_size) / block_size;
> >> \
> >
> > FFALIGN() >> block_bits?
>
> will fix
both are right, I just dislike divisions
> >> +} while (0)
> >> +
> >> +static int decode_entropy_image(WebPContext *s, ImageContext *rgba_img)
> >> +{
> >> + int ret, block_bits, block_size, width, blocks_w, blocks_h;
> >> +
> >> + width = s->width;
> >> + if (s->color_index_width_bits > 0)
> >> + width = s->reduced_width;
> >> +
> >> + PARSE_BLOCK_SIZE(width, s->height);
> >> +
> >> + ret = decode_image(s, IMAGE_ROLE_ENTROPY, blocks_w, blocks_h);
> >> + if (ret < 0)
> >> + return ret;
> >
> > Please select better names, because here it looked like a loop (decode_image
> > calls decode_entropy_coded_image and not decode_entropy_image).
>
> Yeah, the names suck a lot. It's partly because the terminology in the
> spec sucks a lot and I can't think of anything better. I should probably
> write up an overview at the top of the file that describes how this
> crazy format is structured. In this case, decode_image() is mainly a
> wrapper around decode_entropy_coded_image() that just sets up the
> AVFrame stuff, so I'll merge those two.
Good.
> >> +
> >> + s->use_entropy_image = 1;
> >> + s->entropy_size = block_size;
> >> +
> >> + /* the number of huffman groups is determined by the maximum group
> >> number
> >> + coded in the entropy image */
> >> + {
> >> + ImageContext *img;
> >> + int x, y, max;
> >> + uint8_t *p;
> >> +
> >> + img = &s->image[IMAGE_ROLE_ENTROPY];
> >> + max = 0;
> >> + for (y = 0; y < img->frame->height; y++) {
> >> + for (x = 0; x < img->frame->width; x++) {
> >> + p = get_pixel(img->frame, x, y);
> >> + max = FFMAX(max, p[2]);
> >> + }
> >> + }
> >> + rgba_img->num_huffman_groups = max + 1;
> >> + }
> >
> > why such block?
>
> no particular reason. I'll change it.
please note that I've pointed only to two of them, there might be more
[...]
> >> +/* PRED_MODE_AVG_AVG_L_TL_AVG_T_TR */
> >> +static void inv_predict_10(uint8_t *p, const uint8_t *p_l, const uint8_t
> >> *p_tl,
> >> + const uint8_t *p_t, const uint8_t *p_tr)
> >> +{
> >> + p[0] = avg2_comp(avg2_comp(p_l[0], p_tl[0]), avg2_comp(p_t[0],
> >> p_tr[0]));
> >> + p[1] = avg2_comp(avg2_comp(p_l[1], p_tl[1]), avg2_comp(p_t[1],
> >> p_tr[1]));
> >> + p[2] = avg2_comp(avg2_comp(p_l[2], p_tl[2]), avg2_comp(p_t[2],
> >> p_tr[2]));
> >> + p[3] = avg2_comp(avg2_comp(p_l[3], p_tl[3]), avg2_comp(p_t[3],
> >> p_tr[3]));
> >
> > p_l[0] + p_tl[0] + p_t[0] + p_tr[0] >> 2 ?
>
> That's not the same due to rounding.
> Example: 2, 3, 7, 12
> avg(2,3)=2; avg(7,12)=9; avg(2,9)=5
> 2 + 3 + 7 + 12 >> 2 = 6
indeed, disregard it then
If I were Niedermayer I'd suggest usign SWAR
unsigned PL = AV_RN32(p_l), PTL = AV_RN32(p_tl), ....;
A0 = (PL & PTL) + ((PL ^ PTL) >> 1);
A1 = (PT & PTR) + ((PT ^ PTR) >> 1);
AV_WN32(p, (A0 & A1) + ((A0 ^ A1) >> 1));
but since I'm not I don't.
> > honestly, those avg2_comp() everywhere do not improve anything
>
> Ok, I can expand it all out if you think it's clearer.
IMO having (a + b) >> 1 inplace is clearer.
[...]
> >> +static int apply_color_transform(WebPContext *s)
> >> +{
> >> + ImageContext *img, *cimg;
> >> + int x, y, cx, cy;
> >> + uint8_t *p, *cp;
> >> + int block_size = s->color_block_size;
> >> +
> >> + img = &s->image[IMAGE_ROLE_ARGB];
> >> + cimg = &s->image[IMAGE_ROLE_COLOR_TRANSFORM];
> >> +
> >> + for (y = 0; y < img->frame->height; y++) {
> >> + for (x = 0; x < img->frame->width; x++) {
> >> + alias8 green_to_red, green_to_blue, red_to_blue;
> >> + alias8 green, red;
> >> + int blue;
> >> +
> >> + cx = x / block_size;
> >> + cy = y / block_size;
> >> + cp = get_pixel(cimg->frame, cx, cy);
> >> + red_to_blue.u8 = cp[1];
> >> + green_to_blue.u8 = cp[2];
> >> + green_to_red.u8 = cp[3];
> >> +
> >> + p = get_pixel(img->frame, x, y);
> >> + red.u8 = p[1];
> >> + green.u8 = p[2];
> >> + blue = p[3];
> >> +
> >> + red.u8 = add_comp(red.u8,
> >> color_transform_delta(green_to_red.s8, green.s8));
> >> + blue +=
> >> color_transform_delta(green_to_blue.s8, green.s8);
> >> + blue = add_comp(blue,
> >> color_transform_delta(red_to_blue.s8, red.s8));
> >
> > hmm?
>
> Yeah, it's pretty weird. I'll see if I can break it down further to
> simplify it without breaking the math.
I was mostly pointing to the fact you can use += on all three lines instead of
one. And that you need make_signed() function essentially for
color_transform_delta() arguments instead of alias8 (or even hide the
conversion inside that function since it's not needed anywhere else).
> >> +
> >> + p[1] = red.u8;
> >> + p[3] = blue;
> >> + }
> >> + }
> >> + return 0;
> >> +}
> >> +
[...]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel