Re: Looping regexes against a list

2015-01-20 Thread Andrew Solomon
Hey Mike, Aside from this lengthy rant^H^H^H^H discussion:) about where you put your regex, have you made any progress on the performance problem you put forward at the outset? Cheers, Andrew On Tue, Jan 20, 2015 at 6:41 AM, Danny Spell ddsp...@gmail.com wrote: For me, regex can be simple or

Re: Looping regexes against a list

2015-01-20 Thread Shawn H Corey
What the OP should do is put the regexes in a Perl module and unroll the loop. That way, he can group them so that groups of tests are skipped:
    # strings containing foo
    if( /foo/ ){
        return 'food' if /food/;
        return 'fool' if /fool/;
        return 'foot' if /foot/;
        return 'foo';
    }
-- Don't
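
The grouping idea above can be sketched as a small subroutine. This is only an illustration under stated assumptions: the `classify` name, the second `bar` group, and the sample input are hypothetical and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical classifier: labels and patterns are illustrative only.
sub classify {
    my ($text) = @_;
    for ($text) {                    # alias $_ so the bare /.../ matches apply to $text
        if (/foo/) {                 # cheap outer test guards the whole group
            return 'food' if /food/;
            return 'fool' if /fool/;
            return 'foot' if /foot/;
            return 'foo';
        }
        if (/bar/) {                 # second group, skipped entirely when /bar/ fails
            return 'barn' if /barn/;
            return 'bar';
        }
    }
    return;                          # no group matched
}

print classify('a barefoot walk') // 'none', "\n";
```

When the outer test fails, none of the more specific patterns in that group are tried, which is where the saving would come from.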

Re: Looping regexes against a list

2015-01-20 Thread Brandon McCaig
On Tue, Jan 20, 2015 at 10:21 AM, Brandon McCaig bamcc...@gmail.com wrote: perldoc -f qr// I was sure that worked in my up-to-date perlbrew environments, but it isn't working in Cygwin running perl 5.14.2 so in the event that it doesn't work for you look at `perldoc -f qr' and `perldoc perlop'

Re: Looping regexes against a list

2015-01-20 Thread Shawn H Corey
On Tue, 20 Jan 2015 10:23:42 -0500 Brandon McCaig bamcc...@gmail.com wrote: On Tue, Jan 20, 2015 at 10:21 AM, Brandon McCaig bamcc...@gmail.com wrote: perldoc -f qr// I was sure that worked in my up-to-date perlbrew environments, but it isn't working in Cygwin running perl 5.14.2 so in

Re: Looping regexes against a list

2015-01-20 Thread Charles DeRykus
On Tue, Jan 20, 2015 at 9:47 AM, Mike Martin m...@redtux.org.uk wrote: Thanks for the idea about qr, I did try this before, but I've now looked at it again and got about a 75% improvement. As regards the uninitialized point, the errors were coming from regexes (different ones) when the regex

Re: Looping regexes against a list

2015-01-20 Thread Brandon McCaig
On Tue, Jan 20, 2015 at 9:51 AM, Andrew Solomon and...@geekuni.com wrote: Aside from this lengthy rant^H^H^H^H discussion:) about where you put your regex, have you made any progress on the performance problem you put forward at the outset? I'm not quite sure that I understand what the OP is

Re: Looping regexes against a list

2015-01-20 Thread Mike Martin
Thanks for the idea about qr, I did try this before, but I've now looked at it again and got about a 75% improvement. As regards the uninitialized point, the errors were coming from regexes (different ones) when the regex wasn't matching, so testing the result of each regex match was not really an
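
For readers unfamiliar with qr, a minimal sketch of precompiling the patterns once, outside the loop over the text, might look like the following. The table contents, the `@rules` array and the `normalise` name are assumptions made for illustration; they are not the OP's actual code.

```perl
use strict;
use warnings;

# Hypothetical lookup table: replacement text => pattern source.
my %clean = (
    'Heating Engineer' => 'Heating.*?Engineer',
    'HGV Mechanic'     => '(?:HGV|LGV|Lorry).+Mechanic',
);

# Compile each pattern once with qr//, before any text is processed.
my @rules = map { [ $_, qr/$clean{$_}/i ] } sort keys %clean;

sub normalise {
    my ($title) = @_;
    for my $rule (@rules) {
        my ($label, $re) = @$rule;
        return $label if $title =~ $re;   # reuse the compiled regex
    }
    return $title;                        # no rule matched: leave the text unchanged
}

print normalise('Senior heating and gas engineer'), "\n";
```

How much this helps depends on the data and the Perl version, so it is worth benchmarking rather than assuming a particular gain.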

Re: Looping regexes against a list

2015-01-20 Thread Shawn H Corey
On Tue, 20 Jan 2015 17:47:58 + Mike Martin m...@redtux.org.uk wrote: Take a load of Job Vacancy posts (xml files - loads of) Parse the Information, getting rid of as much garbage as possible Push a distinct list into a lookup hash If you're running Linux (or any POSIX), see `man sort` and
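
If the de-duplication step stays in Perl, a common idiom for building a distinct list is a `%seen` hash. The sample titles below are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: job titles with repeats.
my @titles = ('HGV Driver', 'Chef', 'HGV Driver', 'Chef', 'Plumber');

# Keep the first occurrence of each title, preserving input order.
my %seen;
my @distinct = grep { !$seen{$_}++ } @titles;

print "$_\n" for @distinct;
```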

Re: Looping regexes against a list

2015-01-19 Thread Danny Spell
For me, regex can be simple or complex. It depends on the task at hand. The more complex the task, the more complex the regex. My boss who can code, but doesn't want to, *HATES* regex. Personally, I think it is pretty powerful and I'm grateful for its flexibility. I use it daily in my job where

Looping regexes against a list

2015-01-19 Thread Mike Martin
Hi, I am looking for the most performant way to achieve this. I have a big list of text (47+ lines in a hash). I then run a hash ref consisting of replacement text - pattern to search - optional 3rd param for grouping matches. So I loop through the text and then loop the regex hash against each

Re: Looping regexes against a list

2015-01-19 Thread Brandon McCaig
Mike: On Mon, Jan 19, 2015 at 01:25:56PM +, Mike Martin wrote: Hi Hello, I am looking for the most performant way to achieve this I have a big list of text (47+ lines in a hash) I then run a hash ref consisting of replacement text - pattern to search - optional 3rd param for

Re: Looping regexes against a list

2015-01-19 Thread Mike Martin
The lookup hash is like this:
%clean=(
    HeatingEngineer => (?:Heating.*?Engineer)\b.*?
    HGV Driver => (?=\A(?:(?!tech|mech).)*$)(?:HGV|LGV|Class.?1|Class.?2).?(?:1|2|3|)(?:.+Driver|).*?
    HGV Mechanic => (?:(?:HGV|LGV|Lorry).+(?:Mech?anics?|technicians?))\b.*?
    Highway Engineer => (?:(?:Highway.?)

Re: Looping regexes against a list

2015-01-19 Thread Shawn H Corey
On Mon, 19 Jan 2015 15:57:35 + Andrew Solomon and...@geekuni.com wrote: On Mon, Jan 19, 2015 at 2:50 PM, Mike Martin m...@redtux.org.uk wrote: The lookup hash is like this %clean=( HeatingEngineer = (?:Heating.*?Engineer)\b.*? HGV

Re: Looping regexes against a list

2015-01-19 Thread Andrew Solomon
First question which comes to mind is - do you really need to call sort in those two foreach collections? Sort can often take time, and the fact that you've got one sort nested inside a foreach loop is bad karma:) Andrew On Mon, Jan 19, 2015 at 2:50 PM, Mike Martin m...@redtux.org.uk wrote: The
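
A rough sketch of the point about sort: if ordering is needed at all, each key list can be sorted once before the loops rather than inside them. The hash contents here are hypothetical stand-ins for the OP's data.

```perl
use strict;
use warnings;

# Hypothetical data: line id => text, and replacement => compiled pattern.
my %text  = (1 => 'Lorry mechanic', 2 => 'Heating engineer');
my %clean = (
    'HGV Mechanic'     => qr/Lorry.+Mechanic/i,
    'Heating Engineer' => qr/Heating.*?Engineer/i,
);

# Sort each key list once, instead of re-sorting on every outer iteration.
my @text_keys  = sort { $a <=> $b } keys %text;
my @clean_keys = sort keys %clean;

for my $id (@text_keys) {
    for my $label (@clean_keys) {
        if ($text{$id} =~ $clean{$label}) {
            $text{$id} = $label;   # replace the text with its canonical label
            last;                  # stop at the first matching pattern
        }
    }
}

print "$_: $text{$_}\n" for @text_keys;
```

If the order of processing does not matter, the sorts can be dropped altogether.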

Re: Looping regexes against a list

2015-01-19 Thread John Mason
On Mon, Jan 19, 2015 at 11:36 AM, Shawn H Corey shawnhco...@gmail.com wrote: On Mon, 19 Jan 2015 11:18:01 -0500 bill pemberton wape...@gmail.com wrote: I fail to see why a regex can only be changed by a programmer. please expand upon this. You shouldn't blindly trust input from a user,

Re: Looping regexes against a list

2015-01-19 Thread Shawn H Corey
On Mon, 19 Jan 2015 11:18:01 -0500 bill pemberton wape...@gmail.com wrote: I fail to see why a regex can only be changed by a programmer. please expand upon this. Regex is a programming language in its own right. Why should an average user have any knowledge of it? -- Don't stop where the

Re: Looping regexes against a list

2015-01-19 Thread Shawn H Corey
On Mon, 19 Jan 2015 11:43:41 -0500 John Mason john.mason...@gmail.com wrote: On Mon, Jan 19, 2015 at 11:36 AM, Shawn H Corey shawnhco...@gmail.com wrote: On Mon, 19 Jan 2015 11:18:01 -0500 bill pemberton wape...@gmail.com wrote: I fail to see why a regex can only be changed by a