On Thu, Dec 08, 2011 at 10:09:20AM -0700, Doug McNutt wrote:
> The [ \t]* (space and tab) sequences seem to be too much and too often.
> As I see the data there are no tabs in it though that might be an email
> thing. Just using ' *' (\ *), (escaped space star) might be a
> significant reduction in the work required by the regex software.

In the email it's all spaces, but the original file could very well have
tabs there, so I allowed for it. Changing [ \t]* to \ * simplifies the
regex slightly, but won't reduce the amount of backtracking in this case.

> Those asterisks in the detailed trip lines don't seem to be specifically
> allowed for. The * character has a special meaning in the regex and
> somehow the *'s in the text file must be allowed for in one of the .*\r
> (anything repeated until a return) expressions. Do they need to be tagged
> as not greedy when the regex is dealing with line ends internally?

The asterisks don't need to be specifically accounted for. They're matched
by .* just like all the other characters on those lines. :)

It's not necessary to make .*\r non-greedy. Because . cannot match \r
anyway, there is no difference between .*\r and .*?\r; they both match up
to the first \r.

> The lines to be counted begin with three spaces and end with two spaces
> and a digit. That looks pretty easy to count in a line by line reader.
> Counting the asterisks globally might also be a quick route to the
> desired result.

I don't think David has told us what the asterisks mean, so we don't know
if they'd be useful for getting the desired result. :)

> If you give up because BBedit is overflowing its stack I can help with a
> perl script but I would read the file line by line and collect data in
> variables devoted to the task at hand. Perhaps a preprocessor that
> creates an output file that is easier to work with in BBEdit or a
> spreadsheet.

FWIW, the regex should work fine in Perl, which has more robust handling
of the regex stack.

Of course, a script that reads line by line would be a fine solution as
well. And in some ways a better solution; for example, you could organize
the trips into separate lists in a single pass through the file.

Ronald

-- 
You received this message because you are subscribed to the
"BBEdit Talk" discussion group on Google Groups.
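The point about .*\r and .*?\r can be checked with a small sketch. This is only an illustration, not BBEdit's own engine: it uses Python's re module, where . excludes only \n, so the character class [^\r] stands in for a dot that cannot match a return. The sample text is made up.

```python
import re

# Stand-in for a dot that cannot match a carriage return.
# (Python's own "." excludes only "\n", so the class is spelled out.)
NOT_CR = r"[^\r]"

greedy = re.compile(NOT_CR + r"*\r")   # models .*\r
lazy = re.compile(NOT_CR + r"*?\r")    # models .*?\r

# Hypothetical sample text with \r line endings; not the real data.
sample = "   trip one *  1\r   trip two **  2\rlast line without a return"

greedy_matches = greedy.findall(sample)
lazy_matches = lazy.findall(sample)

# Both patterns have to stop at the first \r they reach,
# so the two lists of matches are identical.
print(greedy_matches == lazy_matches)
for text in greedy_matches:
    print(repr(text))
```

Both versions find the same two lines, asterisks included, and neither matches the last line because it has no return to anchor on.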

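The line-by-line approach might look roughly like the sketch below, written in Python rather than Perl. The only part taken from the thread is the shape of a trip line (three leading spaces, ending in two spaces and a digit). Grouping the lines by that trailing digit is an assumption made here for illustration, and both the function name group_trip_lines and the sample data are hypothetical.

```python
import re
from collections import defaultdict

# Assumed shape of a trip line, per the description in the thread:
# three leading spaces, anything, then two spaces and a single digit.
TRIP_LINE = re.compile(r"^ {3}.*  (\d)$")


def group_trip_lines(lines):
    """Return {trailing digit: [trip lines]} from one pass over lines."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\r\n")  # drop the line ending, if any
        match = TRIP_LINE.match(line)
        if match:
            groups[match.group(1)].append(line)
    return dict(groups)


# Hypothetical data; the real file's layout is not shown here.
sample = [
    "Route summary",
    "   Elm St to Main St *  1",
    "   Main St to Depot  2",
    "   Depot to Elm St **  1",
    "  only two leading spaces  1",
]

groups = group_trip_lines(sample)
for digit in sorted(groups):
    print(digit, len(groups[digit]))
```

Passing an open file object instead of a list should work the same way, since Python's default text mode also treats a bare \r as a line ending. No regex stack is involved beyond one short line at a time.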