David sent me a long test file off-list. Here's an improved regex that
works in BBEdit:
(?mx)
(^THIS\ IS\ TRIP\ NUMBER.*\r
(?:\S.*\r)*
\r
(?>(?:
(?:[ \t]*\d.*\r)+
[ \t]*-----+.*\r
)*)
(?:[ \t]*\d.*\r){3}
[ \t]*\(.*\rTHE\ CALCULATED\ CREW\ COST.*\r
)
The only change was to wrap the middle part of the regex in an atomic
group, (?> ). Once an atomic group has matched, the regex engine won't
backtrack into it. This saves the engine a lot of wasted work when the
whole regex isn't going to match anyway.
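To see the "no backtracking into (?> )" behaviour in isolation, here is a
minimal Python sketch. It is only an illustration: the toy pattern and the
"aaab" input are made up and have nothing to do with David's file. Python's
re module accepts (?> ) directly only from version 3.11 on, so the sketch
uses the usual lookahead-plus-backreference emulation, which should behave
the same way on older versions.

```python
import re

# Plain non-capturing group: the engine may give back characters
# that a+ already matched in order to let the rest of the pattern succeed.
backtracking = re.compile(r"(?:a+)ab")

# Emulated atomic group, equivalent to (?>a+)ab. The lookahead captures the
# greedy run of a's, and \1 then consumes exactly that run. The engine does
# not re-enter a lookahead after it has succeeded, so nothing is given back.
atomic = re.compile(r"(?=(a+))\1ab")

text = "aaab"  # hypothetical input, chosen so the two patterns disagree

plain_match = backtracking.search(text)
atomic_match = atomic.search(text)

# The plain group backtracks from "aaa" to "aa" and then matches "ab".
print(plain_match.group(0) if plain_match else None)

# The atomic version keeps all the a's it took, so "ab" can never follow.
print(atomic_match.group(0) if atomic_match else None)
```

The same mechanism is what makes the regex above fail fast: once the
repeated block of lines has matched, the engine no longer retries every
shorter way of splitting it up.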
Although the original regex does work in Perl, it is affected by the
excessive backtracking there as well. On David's test file, for example, a
Perl script using the original regex takes ~45 seconds to find all matches,
while using the above improved regex takes less than 0.1 seconds!
Ronald