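I'll give it a go...
Use "<.+?>" to remove the tags,
followed by "|" (or),
followed by "^[\d:,\s>-]+" to remove the sequence number and timestamp. So replace
this: "<.+?>|^[\d:,\s>-]+" with nothing. Because "\s" also matches line breaks, the blank lines between entries are removed as well.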

Yielded:
the kid has no idea what you mean
because you don't know what you mean but
you mean something like well don't be a
pain in the neck don't be a diva don't
be a don't be narcissistic and something
like that but you actually mean
something really important what you mean
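
For anyone who wants to try the same pattern outside BBEdit, here is a rough Python sketch. It assumes Python's re module with re.MULTILINE treats "^" the way BBEdit's Grep does for this pattern (matching at the start of every line). The helper name strip_srt and the shortened sample are made up for illustration.

```python
import re

# Same pattern as above: either an HTML-style tag, or a run of digits,
# colons, commas, whitespace, ">" and "-" anchored at the start of a line.
PATTERN = re.compile(r"<.+?>|^[\d:,\s>-]+", re.MULTILINE)


def strip_srt(text):
    """Hypothetical helper: drop tags, sequence numbers and timestamps."""
    return PATTERN.sub("", text)


# Two entries from the original post, with the email line wrapping undone.
sample = (
    "1092\n"
    "00:40:25,710 --> 00:40:29,220\n"
    'the kid has<font color="#E5E5E5"> no idea what you mean</font>\n'
    "\n"
    "1093\n"
    "00:40:27,119 --> 00:40:31,019\n"
    '<font color="#E5E5E5">because you don\'t know what you mean</font>'
    '<font color="#CCCCCC"> but</font>\n'
)

print(strip_srt(sample))
```

One caveat with this pattern: a caption line that itself begins with a digit or a dash would lose those leading characters, since they fall inside the character class.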

I didn't care for that, so I tweaked it to leave the sequence number in:
"<.+?>|^\d{2}:[\d:,\s>-]+"

Yielded:
1092
the kid has no idea what you mean

1093
because you don't know what you mean but

1094
you mean something like well don't be a

1095
pain in the neck don't be a diva don't

1096
be a don't be narcissistic and something

1097
like that but you actually mean

1098
something really important what you mean
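
As a rough way to check the tweaked pattern outside BBEdit, here is a small Python sketch. It assumes re.MULTILINE gives "^" the same start-of-line meaning as BBEdit's Grep. The helper name strip_timestamps and the two-entry sample are hypothetical.

```python
import re

# Tweaked pattern: the second alternative must start with two digits and a
# colon, so it matches timestamp lines but not the sequence numbers.
PATTERN = re.compile(r"<.+?>|^\d{2}:[\d:,\s>-]+", re.MULTILINE)


def strip_timestamps(text):
    """Hypothetical helper: drop tags and timestamps, keep sequence numbers."""
    return PATTERN.sub("", text)


# Two entries from the original post, with the email line wrapping undone.
sample = (
    "1097\n"
    "00:40:35,369 --> 00:40:39,059\n"
    '<font color="#CCCCCC">like</font><font color="#E5E5E5"> that but you</font>'
    '<font color="#CCCCCC"> actually mean</font>\n'
    "\n"
    "1098\n"
    "00:40:36,929 --> 00:40:42,089\n"
    'something<font color="#E5E5E5"> really important what you mean</font>\n'
)

print(strip_timestamps(sample))
```

A sequence number such as "1097" survives because its first two digits are not followed by a colon, so the timestamp alternative never starts matching there.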

On Friday, January 18, 2019 at 9:00:44 PM UTC-5, Dj wrote:
>
> Hello, my father is hard of hearing and I'd like to send him some closed 
> caption files so he can read the content like you would a book. Is there an 
> easy way to strip data out of the below example so it's only text, and not 
> timestamps and tags? I've been trying, and failing, to come up with an 
> expression for the last few hours...
>
> The original text is like the example below, and I'm trying to fetch *only the 
> spoken/text elements so they sit next to each other without gaps*. Is 
> this a multi-part operation, or is it possible with one expression? Thanks!
>
> 1092
> 00:40:25,710 --> 00:40:29,220
> the kid has<font color="#E5E5E5"> no idea what you mean</font>
>
> 1093
> 00:40:27,119 --> 00:40:31,019
> <font color="#E5E5E5">because you don't know what you mean</font><font color="#CCCCCC"> but</font>
>
> 1094
> 00:40:29,219 --> 00:40:33,149
> you mean something<font color="#CCCCCC"> like well don't be a</font>
>
> 1095
> 00:40:31,019 --> 00:40:35,369
> pain<font color="#E5E5E5"> in the neck</font><font color="#CCCCCC"> don't be a diva don't</font>
>
> 1096
> 00:40:33,150 --> 00:40:36,930
> be a don't be narcissistic<font color="#CCCCCC"> and something</font>
>
> 1097
> 00:40:35,369 --> 00:40:39,059
> <font color="#CCCCCC">like</font><font color="#E5E5E5"> that but you</font><font color="#CCCCCC"> actually mean</font>
>
> 1098
> 00:40:36,929 --> 00:40:42,089
> something<font color="#E5E5E5"> really important what you mean</font>
>
>

-- 
This is the BBEdit Talk public discussion group. If you have a 
feature request or need technical support, please email
"[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <https://www.twitter.com/bbedit>
