Another "occasional user" question, with not enough time to learn all the cool tools BBEdit and regex offer that would solve my problem.

Context: I have about 500 KB of video transcripts (43 .txt files) to edit and refine. All files have had the timecode removed.
I want to find all instances of doubled words but ignore a subset of those matches, i.e., search for doubled words in a video transcript, but *EXCLUDE "many, many" and "very, very"*. In effect, this will reduce instances of stuttering in the transcript while leaving the intentional repeats intact.

This search string finds doubled words separated by a comma and a space, which covers most of the instances of doubled words:

Find: (\b[A-Za-z]+\b),\s\1
Replace with: \1

E.g., find "*what, what*" and replace with "*what*" in the string:

*So when we talk about the structure of data that describes what, what identifies our columns*

But do not replace "*very, very*" in the string:

*or even the greater distance away from zero, is **very, very** small.*

Thank you for any hints on doing this.

Glenn

--
This is the BBEdit Talk public discussion group.
To view this discussion visit https://groups.google.com/d/msgid/bbedit/e1806b72-be8d-4cf4-a263-148b9214828bn%40googlegroups.com.
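For illustration only, here is a minimal Python sketch of the behavior described above, assuming a negative lookahead is an acceptable way to skip the intentional repeats. The names `KEEP`, `DOUBLED`, and `collapse_doubles` are hypothetical, and the trailing `\b` is an addition that is not in the original search string; it stops "the, then" from being treated as a doubled "the". Matching here is case-sensitive.

```python
import re

# Assumption: these are the only intentional repeats to leave alone.
KEEP = ("many", "very")

# Based on the search string in the post, with two additions:
#   (?!(?:many|very)\b)  negative lookahead: do not start a match on a kept word
#   trailing \b          the repeated word must end at a word boundary
DOUBLED = re.compile(r"\b(?!(?:%s)\b)([A-Za-z]+),\s\1\b" % "|".join(KEEP))


def collapse_doubles(text: str) -> str:
    """Replace 'word, word' with 'word', except for the words in KEEP."""
    return DOUBLED.sub(r"\1", text)


if __name__ == "__main__":
    print(collapse_doubles("the structure of data that describes what, what identifies our columns"))
    print(collapse_doubles("away from zero, is very, very small."))
```

Since BBEdit's Grep search uses PCRE-style patterns, the same expression, `\b(?!(?:many|very)\b)([A-Za-z]+),\s\1\b` with `\1` as the replacement, should carry over, but it is worth testing on a copy of one transcript before running it across all 43 files.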
