Hi Deron,

> I was going to actually test this with some old broadcast equipment I have 
> just dying for a purpose, but I don't see how to generate AV_PKT_DATA_A53_CC 
> side packet data except using the Decklink capture. I have A53 documentation, 
> but it just refers to CEA-708 (or SMPTE 334, or ... what an unraveling ball 
> of yarn it is. Looks like I could spend a months income on standards just 
> trying to learn how this is encoded).

Yeah.  You could certainly spend a good bit of cash if you had to buy the 
individual specs.  Worth noting, though, that the ATSC specs are freely 
available on their website, and CEA-708 is largely described in the FCC rules 
(not a substitute for the real thing, but good enough for the casual reader).  
SMPTE has a “digital library” where you can get access to *all* their specs 
with a subscription of around $600/year.  It’s not ideal for a 
non-professional, but for people who *need* the specs it’s way cheaper than 
buying them piecemeal for $120/spec.

> 
> On a side note, can AV_PKT_DATA_A53_CC be used for something besides CEA-708? 
> Not sure I understand the line between A53 CC encoding (which is at least in 
> part what this generates, right?) and CEA-708 (which is what this takes, 
> right?) and why this side data is called A53_CC?
> 
> I know these questions are outside the scope that you were asking…
> 
No problem.  I should really write a primer on this stuff, since there are a 
whole bunch of specs which are interrelated.  Briefly…

CEA-708 is what non-technical people typically consider to be “digital closed 
captions”.  It’s the standard that replaces old-fashioned NTSC closed captions, 
which were described in EIA/CEA-608.  The spec describes what could be 
characterized as a protocol stack of functionality, from the transport layer 
through the presentation layer (i.e. how the captions are constructed, rules 
for how to render them on-screen, etc.).

CEA-708 also includes a construct for tunneling old CEA-608 packets.  In fact, 
most CEA-708 streams are really just up-converted from CEA-608, since the FCC 
requires both to be supported and 608 is functionally a subset of 708.  On the 
other hand, you typically can’t down-convert 708 to 608, since there are a 
bunch of formatting codes in 708 which have no corresponding capability in 608. 
 VLC and most other applications will claim to render 708 captions, but they’re 
really just rendering the 608 captions tunneled inside the 708 stream.
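
To make the tunneling concrete, here’s a rough C sketch of how the three-byte 
cc_data triplets distinguish tunneled 608 pairs from native 708 (DTVCC) data.  
The bit layout is from my memory of CEA-708/A-53, and the function name is just 
for illustration:

#include <stdint.h>
#include <stdio.h>

/* One cc_data triplet, as carried in a CDP ccdata_section or an A/53
 * cc_data() construct:
 *   byte 0: marker_bits (5) | cc_valid (1) | cc_type (2)
 *   bytes 1-2: the actual caption payload
 */
static void classify_cc_triplet(const uint8_t cc[3])
{
    int cc_valid = (cc[0] >> 2) & 0x01;
    int cc_type  =  cc[0]       & 0x03;

    if (!cc_valid)
        return;  /* padding triplet, nothing to render */

    switch (cc_type) {
    case 0:  /* tunneled CEA-608, field 1 (what most players actually render) */
    case 1:  /* tunneled CEA-608, field 2 */
        printf("608 byte pair: 0x%02x 0x%02x\n", cc[1], cc[2]);
        break;
    case 2:  /* native 708 DTVCC packet data */
    case 3:  /* native 708 DTVCC packet start */
        printf("708 DTVCC byte pair\n");
        break;
    }
}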

One component of the CEA-708 spec describes the “CDP”, or “Caption 
Distribution Packet”.  This is a low-level packet format which includes not 
just multiple caption streams but also timecodes and service data (e.g. caption 
languages).  CDP packets can be sent over a number of different physical 
transports, including old-fashioned serial ports.
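
Roughly, a CDP looks like this on the wire (layout from memory of CEA-708 and 
SMPTE 334-2, so treat it as a sketch rather than a reference -- the real thing 
is a packed bitstream, not a C struct):

#include <stdint.h>

struct cdp_header {
    uint16_t cdp_identifier;         /* always 0x9669 */
    uint8_t  cdp_length;             /* total CDP length in bytes */
    uint8_t  cdp_frame_rate;         /* 4-bit rate code (29.97, 59.94, ...) + reserved */
    uint8_t  flags;                  /* time_code_present, ccdata_present,
                                        svcinfo_present, caption_service_active, ... */
    uint16_t cdp_hdr_sequence_cntr;  /* increments with every CDP */
};

/* The header is followed by optional sections, each introduced by an id byte:
 *   0x71  time_code_section
 *   0x72  ccdata_section     (the three-byte caption triplets live here)
 *   0x73  ccsvcinfo_section  (caption service descriptors: languages, etc.)
 *   0x74  cdp_footer         (footer sequence counter + packet checksum)
 */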

SMPTE 334M describes how to transport CEA-708 CDP packets over an SDI link in 
the VANC area of the frame.
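
If it helps, the way you find the CDP in the VANC is by its DID/SDID pair; if I 
remember right, SMPTE 334-1 assigns DID 0x61 / SDID 0x01 to CEA-708 CDPs (and 
0x61 / 0x02 to raw CEA-608).  Something along these lines (struct and function 
names made up for the example):

#include <stdint.h>

struct vanc_packet {
    uint8_t        did;
    uint8_t        sdid;
    uint8_t        data_count;
    const uint8_t *payload;   /* the CDP bytes when the DID/SDID match */
};

static int vanc_is_cdp(const struct vanc_packet *pkt)
{
    return pkt->did == 0x61 && pkt->sdid == 0x01;
}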

A53 refers to the ATSC A/53 specification, which describes how digital TV is 
transmitted over the air.  One part of that spec covers how to embed the 
CEA-708 caption bytes into an MPEG-2 transport stream, and then refers you back 
to CEA-708 for the details of what to do with those bytes.
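
For MPEG-2, that wrapping ends up looking roughly like the following (layout 
from memory of A/53 Part 4 -- the “GA94” identifier and type code 0x03 should 
be right, but verify against the spec before relying on it):

#include <stdint.h>
#include <string.h>

/* Build the A/53 caption payload that sits inside an MPEG-2 user_data()
 * construct (or a registered user data SEI for H.264).  "cc" is cc_count
 * three-byte triplets.  Sketch only -- no bounds checking. */
static size_t write_a53_cc_payload(uint8_t *buf, const uint8_t *cc, int cc_count)
{
    uint8_t *p = buf;

    memcpy(p, "GA94", 4);             p += 4;   /* ATSC user_identifier */
    *p++ = 0x03;                      /* user_data_type_code: cc_data */
    *p++ = 0x40 | (cc_count & 0x1f);  /* process_cc_data_flag + cc_count */
    *p++ = 0xff;                      /* em_data (unused) */
    memcpy(p, cc, cc_count * 3);      p += cc_count * 3;
    *p++ = 0xff;                      /* marker_bits */

    return p - buf;
}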

Both the CEA-708 CDP format and A/53 come down to a series of three-byte 
packets which contain the actual captioning data.  This is what gets serialized 
in AV_PKT_DATA_A53_CC.  In order to encode an SDI feed into an MPEG-2 stream, 
you would need to deconstruct the CDP, extract the captioning bytes, and load 
them into the side data packet.  Once that’s done, the AVPacket is handed off 
to an H.264/MPEG-2 video encoder, which knows how to take those captioning 
bytes and embed them into the compressed video (using the MPEG-2 user_data 
field if it’s MPEG-2 video, or an SEI message if it’s H.264).
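
In FFmpeg terms, the “load them into the side data packet” step looks roughly 
like this (sketch only -- error handling trimmed, and the cc buffer is assumed 
to be the cc_count * 3 bytes you pulled out of the CDP yourself; note that in a 
full libavcodec pipeline the packet side data typically gets propagated to 
AV_FRAME_DATA_A53_CC on the decoded frame before the encoder sees it, but the 
payload bytes are the same):

#include <libavcodec/avcodec.h>
#include <string.h>

static int attach_a53_cc(AVPacket *pkt, const uint8_t *cc, int cc_count)
{
    /* Allocate AV_PKT_DATA_A53_CC side data on the packet and copy the
     * caption triplets into it. */
    uint8_t *side = av_packet_new_side_data(pkt, AV_PKT_DATA_A53_CC,
                                            cc_count * 3);
    if (!side)
        return AVERROR(ENOMEM);

    memcpy(side, cc, cc_count * 3);
    return 0;
}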

That series of three-byte packets is essentially the “lowest common 
denominator” representation of the captioning data (assuming you only care 
about closed captions and not timecodes or service info).  I have use cases 
where that extra info should really be preserved, and am weighing the merits of 
introducing a new side data format for the CDP which preserves all of it, from 
which encoders can extract what they need.  There are pluses/minuses to this 
approach and it’s still under consideration.

I hope that gives you a bit more background.

Cheers,

Devin