On 24/09/2013 00:44, dinkypumpkin wrote:
> On 23/09/2013 23:16, Rob Dixon wrote:
> >
> Unfortunately, that's not how the subtitles arrive from the Beeb. There
> are line breaks within a single speaker's lines, and sometimes no line
> break or other structural change to demarcate the transition between
> speakers in a single subtitle.

That isn't my experience - certainly not across all programmes. Current
editions of QI, for instance, have the text split between voices in
several places. The splits also seem to be at punctuation where
possible, rather than at an arbitrary half-way point. I believe the
subtitles XML file contains the text formatted as it was transcribed and
intended to be viewed.

> The old format has text colour changes to mark transition between
> speakers in a single subtitle, but the newer format doesn't appear to
> use that device.

I suspect that depends on the source of the subtitles. Captions that the
BBC has commissioned itself are uncoloured, but if they come from a
third party - often subtitles for films are of this type - then they can
vary a lot in style and content, and for instance may be labelled with
the speaker's name, or placed on the appropriate side of the screen.

My vote is with keeping the newlines as they are in the XML when
translating to SRT. (I have a short Perl program that does just that
using XML::Parser::Lite if it is of any interest.)

Rob
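
For anyone who wants to see what "keeping the newlines as they are in the XML" might look like in practice, here is a rough Python sketch. It is not the Perl program mentioned above, which isn't shown in this thread, and it rests on assumptions about the file: that each caption is a <p> element with begin/end attributes of the form HH:MM:SS.fff, and that line breaks inside a caption are <br/> elements. The function names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any XML namespace prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def text_with_breaks(elem):
    """Collect an element's text, turning each <br/> into a newline.

    Whitespace inside text nodes (including newlines from pretty-printed
    XML) is collapsed to single spaces, so only <br/> produces a break.
    """
    parts = [re.sub(r"\s+", " ", elem.text or "")]
    for child in elem:
        if local(child.tag) == "br":
            parts.append("\n")
        else:
            # Assumed wrappers such as <span>: keep their text, drop styling.
            parts.append(text_with_breaks(child))
        parts.append(re.sub(r"\s+", " ", child.tail or ""))
    return "".join(parts)


def srt_time(stamp):
    """Convert 'HH:MM:SS.fff' (assumed input form) to SRT 'HH:MM:SS,mmm'."""
    hms, _, frac = stamp.partition(".")
    return f"{hms},{(frac + '000')[:3]}"


def xml_to_srt(xml_text):
    """Build SRT text from every <p> cue, preserving the original breaks.

    Assumes every <p> carries both 'begin' and 'end' attributes.
    """
    root = ET.fromstring(xml_text)
    cues = []
    for p in root.iter():
        if local(p.tag) != "p":
            continue
        lines = [ln.strip() for ln in text_with_breaks(p).split("\n")]
        body = "\n".join(ln for ln in lines if ln)
        timing = f"{srt_time(p.get('begin'))} --> {srt_time(p.get('end'))}"
        cues.append(f"{len(cues) + 1}\n{timing}\n{body}\n")
    return "\n".join(cues)
```

The point of the sketch is only that the breaks in the output come from the <br/> elements in the source and nowhere else; no re-wrapping or speaker detection is attempted.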

