On Wed, Jun 4, 2008 at 10:55 AM, Brian Butterworth
<[EMAIL PROTECTED]> wrote:
> I thought they were trying to do OCR on the captions from the DVB-T stream.
>
> What I was saying was that the old Freeview version of BBC Parliament used
> to have a quarter-screen picture and the information that is now in the
> Astons was provided using MHEG5.  This was clear text (to keep the bandwidth
> down) not bitmap graphics.

Forgive my ignorance, but what is an Aston?

> OCRing is never going to be brilliant, given the semi-transparent nature of
> the captions on BBC Parliament.
>
> However, a clear text feed of the data would keep the data pure, surely?

The machines that put the captions up on the screen keep internal
text-based logs, to which we have access.  However, since this amounts
to pulling logfiles off a set of operational machines, the access
isn't 100% reliable.  The data in the log files is also of variable
quality: some speeches are not captioned, and some captions don't
mark a speech at all (e.g. a reaction shot of the previous speaker
during a long speech can prompt a back and forth of captions, even
though the same person is speaking throughout).  So although we use
the logfiles to get an approximate fix, we had to resort to the
timestamping game for accuracy.
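
To give a rough idea of the "approximate fix" step, here is a minimal
sketch in Python.  It is not our actual code: the Caption record, the
approximate_speeches function and the 20-second min_hold threshold are
all made up for illustration, and the real log format is different.
It only shows the idea of ignoring a brief caption that is sandwiched
between two captions for the same speaker, which is what a reaction
shot tends to look like in the logs.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    start: float  # seconds from the start of the log (hypothetical format)
    name: str     # speaker name shown in the caption


def approximate_speeches(captions, min_hold=20.0):
    """Collapse brief A -> B -> A caption flips into a single speech by A.

    min_hold is an assumed threshold: a caption that stays up for less
    than this many seconds, and sits between two captions for the same
    name, is treated as a reaction shot rather than a new speech.
    """
    kept = []
    for i, cap in enumerate(captions):
        sandwiched = (
            0 < i < len(captions) - 1
            and captions[i - 1].name == captions[i + 1].name
            and captions[i + 1].start - cap.start < min_hold
        )
        if sandwiched:
            continue  # probably a reaction shot, not a change of speaker
        if kept and kept[-1].name == cap.name:
            continue  # same speaker still talking, keep the earlier start
        kept.append(cap)
    return kept


log = [
    Caption(0, "Alice"),
    Caption(300, "Bob"),    # 5-second reaction shot
    Caption(305, "Alice"),
    Caption(600, "Bob"),
]
speeches = approximate_speeches(log)
```

With the sample log above, the brief "Bob" caption at 300 seconds is
dropped and the two "Alice" captions merge into one speech.  The start
times that come out are still only as good as the logs, which is why
they need refining against the timestamps afterwards.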

Hope that helps,

-- etienne
-
Sent via the backstage.bbc.co.uk discussion group.  To unsubscribe, please 
visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.  
Unofficial list archive: http://www.mail-archive.com/[email protected]/
