Re: [FFmpeg-devel] Enhancement layers in FFmpeg

2022-08-01 Thread Soft Works


> -----Original Message-----
> From: ffmpeg-devel  On Behalf Of
> Niklas Haas
> Sent: Monday, August 1, 2022 3:59 PM
> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] Enhancement layers in FFmpeg
> 
> On Mon, 01 Aug 2022 13:17:12 +0000 Soft Works
> wrote:
> > From my (rather limited) point of view, my thoughts are these:
> >
> > When decoding these kinds of sources, a user would typically not
> only
> > want to do the processing in hardware but the decoding as well.
> >
> > I think we cannot realistically expect that any of the hw decoders
> > will add support for this in the near future. As we cannot modify
> > those ourselves, the only way to do such processing would be a
> > hardware filter. I think the EL data would need to be attached
> > to frames as some kind of side data (or similar) and get uploaded
> > by the hw filter (internally), which would then apply the EL data.
> 
> If both the BL and the EL are separate fully coded bitstreams, then
> could we instantiate two independent HW decoder instances to decode
> the
> respective planes?

Sure. TBH, I didn't know that the EL data is encoded in the same
way. I wonder what those frames would look like when viewed standalone...


> > IMO it would be desirable if both of these things could be
> > done in a single operation.
> 
> For Dolby Vision we have little choice in the matter. The EL
> application
> needs to happen *after* chroma interpolation, PQ linearization, IPT
> matrix application, and poly/MMR reshaping. These are currently all
> on-GPU processes in the relevant video output codebases.
> 
> So for Dolby Vision that locks us into the design where we merely
> expose
> the EL planes as part of the AVFrame and leave it to be the user's
> problem 

If ffmpeg cannot apply it, then I don't think there will be many users
who can make use of it :-)


> (or the problem of filters like `vf_libplacebo`).

Something I always wanted to ask you: is it even conceivable to port
this to a CPU implementation (with reasonable performance)?


> An open question (for me) is whether or not this is required for
> SVC-H264, SHVC, AV1-SVC etc.
> 
> > As long as it doesn't have its own format, its own start time,
> > resolution, duration, color space/transfer/primaries, etc.,
> > I wouldn't say that it's a frame.
> 
> Indeed, it seems like the EL data is tied directly to the BL data for
> the formats I have seen so far. So they are just like extra planes on
> the AVFrame - and indeed, we could simply use extra data pointers
> here
> (we already have room for 8).

Hendrik's idea makes sense to me when the EL is not just some
auxiliary data but real frames, decoded with a regular decoder.
That said, I don't know anything about the other enhancement cases either.

Best regards,
softworkz




Re: [FFmpeg-devel] Enhancement layers in FFmpeg

2022-08-01 Thread Niklas Haas
On Mon, 01 Aug 2022 13:17:12 +0000 Soft Works  wrote:
> From my (rather limited) point of view, my thoughts are these:
> 
> When decoding these kinds of sources, a user would typically not only
> want to do the processing in hardware but the decoding as well.
> 
> I think we cannot realistically expect that any of the hw decoders
> will add support for this in the near future. As we cannot modify 
> those ourselves, the only way to do such processing would be a 
> hardware filter. I think the EL data would need to be attached
> to frames as some kind of side data (or similar) and get uploaded
> by the hw filter (internally), which would then apply the EL data.

If both the BL and the EL are separate fully coded bitstreams, then
could we instantiate two independent HW decoder instances to decode the
respective planes?
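
Something like this, I mean (rough sketch only - error handling is
omitted, VAAPI is just an example device type, and nothing here needs
changes to avcodec):

    #include <libavcodec/avcodec.h>
    #include <libavutil/hwcontext.h>

    /* Two independent decoder instances, one per layer. */
    static AVCodecContext *open_layer_decoder(enum AVCodecID id,
                                              AVBufferRef *hw_device)
    {
        const AVCodec *codec = avcodec_find_decoder(id);
        AVCodecContext *ctx  = avcodec_alloc_context3(codec);
        if (hw_device) /* NULL would mean plain software decoding */
            ctx->hw_device_ctx = av_buffer_ref(hw_device);
        avcodec_open2(ctx, codec, NULL);
        return ctx;
    }

    void open_both(void)
    {
        AVBufferRef *hw = NULL;
        av_hwdevice_ctx_create(&hw, AV_HWDEVICE_TYPE_VAAPI, NULL, NULL, 0);

        AVCodecContext *bl_dec = open_layer_decoder(AV_CODEC_ID_HEVC, hw);
        AVCodecContext *el_dec = open_layer_decoder(AV_CODEC_ID_HEVC, hw);
        /* feed BL packets to bl_dec and EL packets to el_dec as usual */
    }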

> IMO it would be desirable if both of these things could be
> done in a single operation.

For Dolby Vision we have little choice in the matter. The EL application
needs to happen *after* chroma interpolation, PQ linearization, IPT
matrix application, and poly/MMR reshaping. These are currently all
on-GPU processes in the relevant video output codebases.
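
Spelled out, the required order is (conceptual only - every function
name here is made up):

    img = chroma_interpolate(bl);     /* 4:2:0 -> 4:4:4 */
    img = pq_linearize(img);          /* undo the PQ transfer */
    img = ipt_matrix(img);            /* IPT matrix application */
    img = reshape_poly_mmr(img, rpu); /* reshaping driven by the DoVi RPU */
    out = apply_el(img, el);          /* only now can the EL be applied */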

So for Dolby Vision that locks us into the design where we merely expose
the EL planes as part of the AVFrame and leave it to be the user's
problem (or the problem of filters like `vf_libplacebo`).

An open question (for me) is whether or not this is required for
SVC-H264, SHVC, AV1-SVC etc.

> As long as it doesn't have its own format, its own start time,
> resolution, duration, color space/transfer/primaries, etc.,
> I wouldn't say that it's a frame.

Indeed, it seems like the EL data is tied directly to the BL data for
the formats I have seen so far. So they are just like extra planes on
the AVFrame - and indeed, we could simply use extra data pointers here
(we already have room for 8).
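
As a sketch of that idea (buffer ownership/refcounting is glossed over
here, and that is probably the tricky part):

    /* A 4:2:0 BL frame uses data[0..2]; AV_NUM_DATA_POINTERS is 8, so
     * the decoded EL planes could occupy some of the remaining slots. */
    for (int i = 0; i < 3; i++) {
        bl_frame->data[4 + i]     = el_frame->data[i];
        bl_frame->linesize[4 + i] = el_frame->linesize[i];
    }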

> 
> Best regards,
> softworkz


Re: [FFmpeg-devel] Enhancement layers in FFmpeg

2022-08-01 Thread Hendrik Leppkes
On Mon, Aug 1, 2022 at 1:25 PM Niklas Haas  wrote:
>
> Hey,
>
> We need to think about possible ways to implement reasonably-transparent
> support for enhancement layers in FFmpeg. (SVC, Dolby Vision, ...).
> There are more open questions than answers here.
>
> From what I can tell, these are basically separate bitstreams that carry
> some amount of auxiliary information needed to reconstruct the
> high-quality bitstream. That is, they are not independent, but need to
> be merged with the original bitstream somehow.
>
> How do we architecturally fit this into FFmpeg? Do we define a new codec
> ID for each (common/relevant) combination of base codec and enhancement
> layer, e.g. HEVC+DoVi, H.264+SVC, ..., or do we transparently handle it
> for the base codec ID and control it via a flag? Do the enhancement
> layer packets already make their way to the codec, and if not, how do we
> ensure that this is the case?

The EL on Blu-rays is a separate stream, so that would need to be
handled in some fashion. Unless it wouldn't need to be - see below.

>
> Can the decoder itself recursively initialize a sub-decoder for the
> second bitstream? And if so, does the decoder apply the actual
> transformation, or does it merely attach the EL data to the AVFrame
> somehow in a way that can be used by further filters or end users?

My main question is: how closely related are those streams?
I know that the Dolby EL can be decoded basically entirely separately
from the main video stream. But the Dolby EL might be the special case
here; I have no experience with SVC.

If the enhancement layer is entirely independent, like the Dolby EL,
does avcodec need to do anything at all? It _can_ decode the stream
today; a user application could write code that decodes both the main
stream and the EL stream and links them together, without any changes
in avcodec.
Do we need to complicate this situation by forcing it into avcodec?
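
E.g. something along these lines (just a sketch - demuxing and the
avcodec_send_packet() loops are omitted, and process_pair() is a
made-up placeholder for whatever the application does with the pair):

    /* Decode both streams independently and pair frames by timestamp.
     * Assumes both layers carry comparable pts values. */
    AVFrame *bl = av_frame_alloc();
    AVFrame *el = av_frame_alloc();

    while (avcodec_receive_frame(bl_dec, bl) >= 0) {
        /* advance the EL decoder until it catches up with this BL frame */
        while (el->pts < bl->pts &&
               avcodec_receive_frame(el_dec, el) >= 0)
            ;
        if (el->pts == bl->pts)
            process_pair(bl, el); /* hypothetical application callback */
        av_frame_unref(bl);
    }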

Decoding them in entirely separate decoder instances has the advantage
that one can use hardware for the main stream and software for the EL,
or both in hardware, or whatever one prefers.

Of course this applies to the special situation of the Dolby EL, which
is entirely independent, at least in its primary source - Blu-ray. I
think MKV might mix both into one stream, which is an unfortunate
design decision on their part.

avfilter, for example, is already set up to synchronize two incoming
streams (e.g. for overlay), so the same mechanism could be used to pass
the EL to a processing filter.
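
The user-visible part could then be as simple as (sketch; "el_apply" is
a made-up filter name, not an existing one):

    #include <libavfilter/avfilter.h>

    /* lavfi already synchronizes multiple inputs (the same framesync
     * machinery overlay uses), so a two-input EL filter would just be: */
    AVFilterGraph *graph = avfilter_graph_alloc();
    AVFilterInOut *inputs = NULL, *outputs = NULL;
    avfilter_graph_parse_ptr(graph, "[bl][el] el_apply [out]",
                             &inputs, &outputs, NULL);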

>
> (What about the case of Dolby Vision, which iirc requires handling the
> DoVi RPU metadata before the EL can be applied? What about instances
> where the user wants the DoVi/EL application to happen on GPU, e.g. via
> libplacebo in mpv/vlc?)
>

Yes, processing should be left to dedicated filters.

> How does this metadata need to be attached? A second AVFrame reference
> inside the AVFrame? Raw data in a big side data struct?

For Dolby EL, no attachment is necessary if we follow the above
concept of just not having avcodec care.

- Hendrik


Re: [FFmpeg-devel] Enhancement layers in FFmpeg

2022-08-01 Thread Soft Works


> -----Original Message-----
> From: ffmpeg-devel  On Behalf Of
> Niklas Haas
> Sent: Monday, August 1, 2022 1:25 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] Enhancement layers in FFmpeg
> 
> Hey,
> 
> We need to think about possible ways to implement reasonably-
> transparent
> support for enhancement layers in FFmpeg. (SVC, Dolby Vision, ...).
> There are more open questions than answers here.
> 
> From what I can tell, these are basically separate bitstreams that
> carry
> some amount of auxiliary information needed to reconstruct the
> high-quality bitstream. That is, they are not independent, but need
> to
> be merged with the original bitstream somehow.
> 
> How do we architecturally fit this into FFmpeg? Do we define a new
> codec
> ID for each (common/relevant) combination of base codec and
> enhancement
> layer, e.g. HEVC+DoVi, H.264+SVC, ..., or do we transparently handle
> it
> for the base codec ID and control it via a flag? Do the enhancement
> layer packets already make their way to the codec, and if not, how do
> we
> ensure that this is the case?
> 
> Can the decoder itself recursively initialize a sub-decoder for the
> second bitstream? And if so, does the decoder apply the actual
> transformation, or does it merely attach the EL data to the AVFrame
> somehow in a way that can be used by further filters or end users?

From my (rather limited) point of view, my thoughts are these:

When decoding these kinds of sources, a user would typically not only
want to do the processing in hardware but the decoding as well.

I think we cannot realistically expect that any of the hw decoders
will add support for this in the near future. As we cannot modify 
those ourselves, the only way to do such processing would be a 
hardware filter. I think the EL data would need to be attached
to frames as some kind of side data (or similar) and get uploaded
by the hw filter (internally), which would then apply the EL data.
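
Roughly like this (just a sketch - AV_FRAME_DATA_DOVI_EL is made up
here, standing in for whatever new side-data type this would need):

    #include <string.h>
    #include <libavutil/frame.h>

    /* Attach the (packed) EL data to the decoded BL frame. */
    AVFrameSideData *sd = av_frame_new_side_data(frame,
                                                 AV_FRAME_DATA_DOVI_EL,
                                                 el_size);
    if (sd)
        memcpy(sd->data, el_data, el_size);
    /* The hw filter would then find this side data, upload it
     * internally and apply the EL on the GPU. */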


(I have no useful thoughts for sw decoding) 


> (What about the case of Dolby Vision, which iirc requires handling
> the
> DoVi RPU metadata before the EL can be applied? What about instances
> where the user wants the DoVi/EL application to happen on GPU, e.g.
> via
> libplacebo in mpv/vlc?)

IMO it would be desirable if both of these things could be
done in a single operation.

> How does this metadata need to be attached? A second AVFrame
> reference
> inside the AVFrame? Raw data in a big side data struct?

As long as it doesn't have its own format, its own start time,
resolution, duration, color space/transfer/primaries, etc.,
I wouldn't say that it's a frame.

Best regards,
softworkz