I would use two expressions: one for the first two lines of each entry, which seem to have a standard format, and one for the tags, since their number seems to be arbitrary.
The first expression would be something like this:

search for:
\d+\r\d\d:\d\d:\d\d,\d\d\d --> \d\d:\d\d:\d\d,\d\d\d\r
replace with nothing

That way you get rid of the ID and time codes:

the kid has<font color="#E5E5E5"> no idea what you mean</font> <font color="#E5E5E5">because you don't know what you mean</font><font color="#CCCCCC"> but</font> you mean something<font color="#CCCCCC"> like well don't be a</font> pain<font color="#E5E5E5"> in the neck</font><font color="#CCCCCC"> don't be a diva don't</font> be a don't be narcissistic<font color="#CCCCCC"> and something</font> <font color="#CCCCCC">like</font><font color="#E5E5E5"> that but you</font><font color="#CCCCCC"> actually mean</font> something<font color="#E5E5E5"> really important what you mean</font>

And the second one would be:

search for:
<[^>]+>
replace with nothing

And the result is:

the kid has no idea what you mean because you don't know what you mean but you mean something like well don't be a pain in the neck don't be a diva don't be a don't be narcissistic and something like that but you actually mean something really important what you mean

Jean-Christophe

> On Jan 19, 2019, at 11:00, Dj <[email protected]> wrote:
>
> Hello, my father is hard of hearing and I'd like to send him some closed
> caption files so he can read the content like you would a book. Is there an
> easy way to strip data out of the below example so it's only text, and not
> timestamps and tags? I've been failing trying to come up with an expression
> the last few hours....
>
> Original text is like below and I'm trying to fetch only the spoken/text
> elements so they sit next to each other without gaps. Is this a multi-part
> operation, or is it possible with one expression? Thanks!
>
> 1092
> 00:40:25,710 --> 00:40:29,220
> the kid has<font color="#E5E5E5"> no idea what you mean</font>
>
> 1093
> 00:40:27,119 --> 00:40:31,019
> <font color="#E5E5E5">because you don't know what you mean</font><font color="#CCCCCC"> but</font>
>
> 1094
> 00:40:29,219 --> 00:40:33,149
> you mean something<font color="#CCCCCC"> like well don't be a</font>
>
> 1095
> 00:40:31,019 --> 00:40:35,369
> pain<font color="#E5E5E5"> in the neck</font><font color="#CCCCCC"> don't be a diva don't</font>
>
> 1096
> 00:40:33,150 --> 00:40:36,930
> be a don't be narcissistic<font color="#CCCCCC"> and something</font>
>
> 1097
> 00:40:35,369 --> 00:40:39,059
> <font color="#CCCCCC">like</font><font color="#E5E5E5"> that but you</font><font color="#CCCCCC"> actually mean</font>
>
> 1098
> 00:40:36,929 --> 00:40:42,089
> something<font color="#E5E5E5"> really important what you mean</font>
>
>
> --
> This is the BBEdit Talk public discussion group. If you have a
> feature request or need technical support, please email
> "[email protected]" rather than posting to the group.
> Follow @bbedit on Twitter: <https://www.twitter.com/bbedit>
> ---
> You received this message because you are subscribed to the Google Groups
> "BBEdit Talk" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/bbedit.

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune
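PS: if you ever want to run the same two passes outside BBEdit, here is a rough Python sketch of the idea. It is only an illustration under a few assumptions: the function name srt_to_text is made up, line endings are normalised to "\n" first (BBEdit's \r matches a line break, Python's does not work the same way), and the final whitespace-collapsing step is an extra that is not part of the two expressions above. It is there so the text sits together without gaps.

```python
import re

# Pass 1: a cue number line followed by its time-code line.
# Assumes line endings have already been normalised to "\n".
CUE_HEADER = re.compile(
    r"^\d+\n\d\d:\d\d:\d\d,\d\d\d --> \d\d:\d\d:\d\d,\d\d\d\n",
    re.MULTILINE,
)

# Pass 2: any tag such as <font color="#E5E5E5"> or </font>.
TAG = re.compile(r"<[^>]+>")


def srt_to_text(srt: str) -> str:
    """Strip cue numbers, time codes and tags from SRT-style text (hypothetical helper)."""
    # Normalise Windows and classic Mac line endings.
    srt = srt.replace("\r\n", "\n").replace("\r", "\n")
    without_headers = CUE_HEADER.sub("", srt)
    without_tags = TAG.sub("", without_headers)
    # Extra step, not one of the two expressions: collapse all runs of
    # whitespace (including blank lines) into single spaces.
    return " ".join(without_tags.split())


sample = """1092
00:40:25,710 --> 00:40:29,220
the kid has<font color="#E5E5E5"> no idea what you mean</font>

1093
00:40:27,119 --> 00:40:31,019
<font color="#E5E5E5">because you don't know what you mean</font><font color="#CCCCCC"> but</font>
"""

print(srt_to_text(sample))
```

Keeping the two passes separate mirrors the reasoning above: the header lines have a fixed shape, while the number of tags per line varies.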
