On Thu, Dec 08, 2011 at 10:09:20AM -0700, Doug McNutt wrote:
> The [ \t]* (space and tab) sequences seem to be too much and too often.
> As I see the data there are no tabs in it though that might be an email
> thing.  Just using ' *' (\ *), (escaped space star) might be a
> significant reduction in the work required by the regex software.

In the email it's all spaces, but the original file could very well have
tabs there, so I allowed for it.  Changing [ \t]* to \ * simplifies the
regex slightly, but won't reduce the amount of backtracking in this case.


> Those asterisks in the detailed trip lines don't seem to be specifically
> allowed for. The * character has a special meaning in the regex and
> somehow the *'s in the text file must be allowed for in one of the .*\r
> (anything repeated until a return) expressions. Do they need to be tagged
> as not greedy when the regex is dealing with line ends internally?

The asterisks don't need to be specifically accounted for.  They're matched
by .* just like all the other characters on those lines.  :)
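
To illustrate, here is a minimal sketch in Python rather than BBEdit's own
grep engine. The sample line is made up, since the real file layout isn't
shown in this thread; treat it as an assumption, not David's data.

```python
import re

# Hypothetical sample line; the real file's layout is not shown in this thread.
line = "   ** Trip 12 ** downtown  3\r"

# The * in the pattern is a quantifier applied to '.', while the * characters
# in the text are ordinary data that '.' consumes like any other character.
m = re.match(r".*\r", line)
matched_whole_line = m is not None and m.group(0) == line

# Escaping is only needed when the pattern itself must name a literal asterisk.
literal_star_count = len(re.findall(r"\*", line))

print(matched_whole_line, literal_star_count)
```

The only time the asterisks would need special treatment is if the pattern
had to match them specifically, in which case they'd be written as \*.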

It's not necessary to make .*\r non-greedy.  Because . cannot match \r
anyway, there is no difference between .*\r and .*?\r; they both match up
to the first \r.
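
A small sketch of that equivalence, again in Python as a stand-in.  One
caveat that matters here: Python's . DOES match \r (it only excludes \n), so
the sketch uses [^\r] to model a dot that cannot match \r.  The sample text
is made up.

```python
import re

# Two hypothetical lines with \r line endings.
text = "first line\rsecond line\r"

# Python's '.' matches \r, so [^\r] stands in for a dot that cannot match it.
greedy = re.match(r"[^\r]*\r", text)
lazy = re.match(r"[^\r]*?\r", text)

greedy_match = greedy.group(0)
lazy_match = lazy.group(0)

# Both stop at the first \r, because nothing before the \r in the pattern
# is able to step over one.
print(repr(greedy_match), repr(lazy_match))
```

The two forms can differ in how much work the engine does along the way, but
not in what they end up matching.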


> The lines to be counted begin with three spaces and end with two spaces
> and a digit.  That looks pretty easy to count in a line by line reader.
> Counting the asterisks globally might also be a quick route to the
> desired result.

I don't think David has told us what the asterisks mean, so we don't know
if they'd be useful for getting the desired result.  :)


> If you give up because BBedit is overflowing its stack I can help with a
> perl script but I would read the file line by line and collect data in
> variables devoted to the task at hand. Perhaps a preprocessor that
> creates an output file that is easier to work with in BBEdit or a
> spreadsheet.

FWIW, the regex should work fine in Perl, which has more robust handling of
the regex stack.

Of course, a script that reads line by line would be a fine solution as
well, and in some ways a better one: for example, you could organize the
trips into separate lists in a single pass through the file.
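
As a rough sketch of that single-pass approach (in Python here, though Perl
would do just as well), going only by Doug's description of the lines: three
leading spaces, and two spaces plus a digit at the end.  Grouping by that
trailing digit is purely an assumption for illustration, as are the sample
lines and the name group_trips; the real grouping depends on what the data
means.

```python
import re
from collections import defaultdict

# Assumed shape of a trip line: three leading spaces, any text,
# then two spaces and a single digit at the end of the line.
TRIP_LINE = re.compile(r"^ {3}.*  (\d)$")

def group_trips(lines):
    """Collect trip lines into lists keyed by their trailing digit (hypothetical grouping)."""
    groups = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")  # tolerate \r, \n, or \r\n line endings
        m = TRIP_LINE.match(line)
        if m:
            groups[m.group(1)].append(line)
    return dict(groups)

# Made-up sample data, not taken from the original file.
sample = [
    "Header that should be ignored\r",
    "   ** Trip A  1\r",
    "   Trip B  2\r",
    "   ** Trip C  1\r",
    "Not a trip line  1\r",
]
trips = group_trips(sample)
counts = {digit: len(found) for digit, found in trips.items()}
print(counts)
```

For a real file, the list could be replaced by a file object opened in text
mode; Python's default universal-newlines handling splits lines on \r as well
as \n.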

Ronald
