I wrote Parse::Gnaw to parse a file a little bit at a time using a real
grammar. I was dealing with multi-gigabyte text files that represented a
gate-level version of an ASIC.
As long as the text you are parsing can be defined in chunks and you can
flush at the end of each chunk, Parse::Gnaw ought to parse your text. If
you just want to capture some identifiers, it supports the equivalent of
capturing parens.
As for speed, it's about three or four times faster than Parse::RecDescent.
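To sketch the chunk-and-flush idea in plain Perl (this is only the
general shape of chunk-at-a-time scanning, not Parse::Gnaw's actual
interface; the file name and identifier pattern are made up):

    use strict;
    use warnings;

    my $file = 'netlist.v';    # hypothetical input file
    open my $fh, '<', $file or die "cannot open $file: $!";

    my $carry = '';
    while (read($fh, my $chunk, 1 << 20)) {    # one megabyte at a time
        $chunk = $carry . $chunk;

        # hold back a trailing partial word so it gets scanned with the
        # next chunk instead of being matched split in two
        $carry = ($chunk =~ s/([A-Za-z_]\w*)\z//) ? $1 : '';

        # capture identifiers -- the equivalent of capturing parens
        while ($chunk =~ /\b([A-Za-z_]\w*)\b/g) {
            print "identifier: $1\n";
        }

        # the chunk is now "flushed"; only $carry survives this pass
    }
    print "identifier: $carry\n" if length $carry;   # leftover word at EOF
    close $fh;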
Greg London
-----Original message-----
From: Kripa Sundar <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Sat, Feb 5, 2011 00:53:35 GMT+00:00
Subject: Re: [Boston.pm] Q: giant-but-simple regex efficiency
Thanks for the prompt replies, folks!
Unfortunately, my names can be embedded in larger "words" of the input
text, as long as they are delimited by certain punctuation.
If I can figure out all of the permitted punctuation, I will try out
a split() and a hash lookup.
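If I go that route, the sketch I have in mind is roughly this (the
delimiter class and the name list below are only placeholders):

    use strict;
    use warnings;

    # hypothetical list of names to search for
    my %wanted = map { $_ => 1 } qw(foo_reg bar_net clk_div);

    while (my $line = <STDIN>) {
        # split on the permitted punctuation (adjust the class to taste)
        for my $word (split /[\s.,;:()\[\]]+/, $line) {
            print "hit: $word\n" if $wanted{$word};
        }
    }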
But Regexp::Trie seems more likely to help. (I guess I was hoping that
the Perl regex compiler would automatically do that kind of optimal
regex construction, without the need for a module.)
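My understanding of the usage, assuming Regexp::Trie's new/add/regexp
interface (the name list is again a placeholder), is roughly:

    use strict;
    use warnings;
    use Regexp::Trie;

    my @names = qw(foo_reg foo_net foobar clk_div);

    my $rt = Regexp::Trie->new;
    $rt->add($_) for @names;

    # one trie-optimized pattern instead of a naive foo_reg|foo_net|...
    # alternation; wrap it in boundary assertions to enforce the
    # punctuation-delimited requirement
    my $re = $rt->regexp;

    while (my $line = <STDIN>) {
        while ($line =~ /$re/g) {
            print "matched: $&\n";
        }
    }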
peace, || Finding gifts that do not harm:
--{kr.pA} || http://www.dailygood.org/more.php?n=3159
--
It might look like I'm idle, but at the cellular level I'm really quite
busy.
_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm