Re: [libav-devel] dca xll status update

Niels Möller Wed, 16 Apr 2014 05:48:13 -0700

[email protected] (Niels Möller) writes:

> Next steps:
>
> 1. Do the msb/lsb reassembly.
>
> 2. Figure out how to actually generate any output to the codec user, and
>    do any needed channel permutation.


Now I've done 2 too, but I've run into a few problems.

I'm still testing with the file "Master Audio 5.0 96khz.dts", which has
residual_encode = 0.

* Audio level is really low. When I parse the rice sample values from
  the file, I get input samples with a maximum around 1000. Not sure how
  the inverse prediction should affect the magnitude (but see below).
  And there are almost no lsb bits: only a few of the frames have any
  lsb part at all, and then only a single bit per value. These are
  supposed to be 24-bit values, so if they are converted to, e.g., a
  16-bit wav file, all the samples are rounded to 0 or maybe -1.

* The core data is 48kHz, 512 samples per frame, while the xll data is
  96kHz, 1024 samples per frame. Since residual_encode = 0, I'm not
  entirely sure what I'm supposed to do with the core data. Should it be
  ignored by decoders which understand xll? For now, I added some code
  to check for sample rate mismatch, and then set avctx->sample_rate,
  avctx->channels and frame->nb_samples based on what's in the xll
  stream.

* The code in dca_parser is not aware of xll, so it claims the stream is
  48 kHz, not 96 kHz. So when I let avconv convert the stream to wav
  format, I get a 48 kHz file, and I think avconv down-samples the data
  I generate.

* The inverse linear prediction. I use this code:

    /* Apply predictor. */
    /* NOTE: Processing samples in this order means that the
       predictor is applied to the newly reconstructed samples. */
    for (i = order; i < nsamples; i++) {
        /* FIXME: How large accumulator do we really need? */
        int64_t s;
        unsigned j;

        for (j = s = 0; j < order; j++)
            s += c[j] * samples[i-1-j];

        /* NOTE: Equations seem to imply addition, while the
         * pseudocode seems to use subtraction.*/
        samples[i] -= av_clip ((s + 0x8000) >> 16, -0x1000000, 0xffffff);
    }

  where c are the coefficients, and samples is the vector of (msbs of)
  samples. One difference from the pseudocode is that I don't reverse
  the coefficients; I follow the convention of the equation

    \sum_{k=1}^{order} LP[k] s[n-k]

  Looping in this order and applying the predicted value in-place
  implies that any errors are accumulated. I get quite a lot of frames with
  small samples at the start of the frame and large samples at the end,
  which can't be right.

  Thinking about it, I'd expect that the prediction errors are smaller
  than the audio samples. Then for frames that use prediction, the
  first few raw samples ("part0", not using prediction) should be
  larger than the following values which are only prediction errors.
  And these following samples should be blown up by the inverse
  prediction to the same order of magnitude as the initial samples.
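
The expectation in the last bullet can be checked on synthetic data. What
follows is a minimal round-trip sketch in C, not the decoder's actual code.
The names demo_predict, demo_encode, demo_decode, clip_pred and max_abs are
hypothetical, the coefficients are assumed to be Q16 fixed point, and the sign
convention (encoder adds the prediction, decoder subtracts it) is only an
assumption chosen to match the snippet above; the spec may well intend the
opposite sign or reversed coefficients.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: same clip range as the snippet above. */
static int32_t clip_pred(int64_t v)
{
    if (v < -0x1000000) return -0x1000000;
    if (v >  0xffffff)  return  0xffffff;
    return (int32_t)v;
}

/* Q16 prediction for s[i] from the `order` samples before it, with
 * c[k-1] playing the role of LP[k] (coefficients not reversed).
 * Assumes arithmetic right shift of negative values. */
int32_t demo_predict(const int32_t *s, const int32_t *c,
                     unsigned order, unsigned i)
{
    int64_t acc = 0;

    assert(i >= order); /* never read before the start of the frame */
    for (unsigned j = 0; j < order; j++)
        acc += (int64_t)c[j] * s[i - 1 - j];
    return clip_pred((acc + 0x8000) >> 16);
}

/* Hypothetical encoder side: the first `order` samples are stored raw,
 * the rest as residuals computed from the ORIGINAL samples. */
void demo_encode(const int32_t *in, int32_t *out, const int32_t *c,
                 unsigned order, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        out[i] = i < order ? in[i] : in[i] + demo_predict(in, c, order, i);
}

/* Decoder side: in place, in increasing order, so each prediction is
 * computed from samples that have already been reconstructed. */
void demo_decode(int32_t *buf, const int32_t *c, unsigned order, unsigned n)
{
    for (unsigned i = order; i < n; i++)
        buf[i] -= demo_predict(buf, c, order, i);
}

/* Largest magnitude in v[from..to-1]. */
int32_t max_abs(const int32_t *v, unsigned from, unsigned to)
{
    int32_t m = 0;

    for (unsigned i = from; i < to; i++) {
        int32_t a = v[i] < 0 ? -v[i] : v[i];
        if (a > m)
            m = a;
    }
    return m;
}
```

With the coefficients -2 * 65536 and 65536 and a slowly varying input such as
100000 + 3000 * i + i % 3, the first two stored values keep their full
magnitude while the residuals after them stay tiny, and demo_decode restores
the input exactly. If a real frame instead grows steadily from start to end,
that points at a mismatch in sign, coefficient order or scaling between the
decoder and whatever the encoder assumed, rather than at the in-place loop
order itself.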

I'll go on debugging the inverse prediction. I'll be grateful for any
hints on that or on the other problems.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.