[go-nuts] How to speed up execution time for a set of regular expressions

2016-12-15 Thread Damian Gryski
I wrote a quick tutorial on using ragel to speed up matching regular 
expressions in Go.

https://medium.com/@dgryski/speeding-up-regexp-matching-with-ragel-4727f1c16027

I had planned on writing a few more on using some of ragel's other features 
for matching things.

I have an example of a ragel-based lexer in https://github.com/dgryski/dpc

Hope this helps,

Damian

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [go-nuts] How to speed up execution time for a set of regular expressions

2016-12-13 Thread Andy Balholm
Right. Aho-Corasick can’t be used directly in that case.

But it might still be a major performance win to use Aho-Corasick for the first 
pass, and then confirm with regexes. In other words, if the Aho-Corasick stage 
finds “associate,” then check whether /associate.*with/ matches.

Andy



Re: [go-nuts] How to speed up execution time for a set of regular expressions

2016-12-13 Thread David Sofo
Thank you Andy for your reply. I can have optional classes like (B1|B2|B3)?, 
and some keywords are multiword expressions that can have other words between 
their parts. Example: associate … with, protect … from. Can Aho-Corasick 
string matching be used for this task? As I understand it, Aho-Corasick string 
matching only works for fixed keywords.

On Tuesday, December 13, 2016 at 17:31:58 UTC+1, Andy Balholm wrote:
>
> If it’s actually just a list of keywords (no wildcards, character ranges, 
> etc.), I would recommend using Aho-Corasick string matching rather than 
> regular expressions.
>
> Andy
>
> On Dec 13, 2016, at 7:53 AM, David Sofo wrote:
>
> Hi,
>
> I have a set of rules expressed as regular expressions (around 1000 rules 
> expected) that find keywords in text files (~50 KB each). How can I speed 
> up the execution time? Currently I compile the regexes at initialization 
> time in an init function and put them in a package-level map, then run 
> them in a loop. The regexes have this form:
>
> \b(?:( (A1|A2|A3) | (B1|B2|B3) ) )\b
>
> Spaces are added for readability. A and B are classes of keywords.
>
> How can I speed up the execution: at the regular expression level or at 
> other levels (such as execution priority)? I am using Ubuntu 14.04. Any 
> suggestion is welcome. Thank you.
>
> Here is the code
>
> Regards
> David


Re: [go-nuts] How to speed up execution time for a set of regular expressions

2016-12-13 Thread Andy Balholm
If it’s actually just a list of keywords (no wildcards, character ranges, 
etc.), I would recommend using Aho-Corasick string matching rather than regular 
expressions.

Andy

> On Dec 13, 2016, at 7:53 AM, David Sofo wrote:
>
> Hi,
>
> I have a set of rules expressed as regular expressions (around 1000 rules 
> expected) that find keywords in text files (~50 KB each). How can I speed 
> up the execution time? Currently I compile the regexes at initialization 
> time in an init function and put them in a package-level map, then run 
> them in a loop. The regexes have this form:
>
> \b(?:( (A1|A2|A3) | (B1|B2|B3) ) )\b
>
> Spaces are added for readability. A and B are classes of keywords.
>
> How can I speed up the execution: at the regular expression level or at 
> other levels (such as execution priority)? I am using Ubuntu 14.04. Any 
> suggestion is welcome. Thank you.
>
> Here is the code
> 
> Regards
> David


[go-nuts] How to speed up execution time for a set of regular expressions

2016-12-13 Thread David Sofo
Hi,

I have a set of rules expressed as regular expressions (around 1000 rules 
expected) that find keywords in text files (~50 KB each). How can I speed up 
the execution time? Currently I compile the regexes at initialization time in 
an init function and put them in a package-level map, then run them in a 
loop. The regexes have this form:

\b(?:( (A1|A2|A3) | (B1|B2|B3) ) )\b

Spaces are added for readability. A and B are classes of keywords.

How can I speed up the execution: at the regular expression level or at other 
levels (such as execution priority)? I am using Ubuntu 14.04. Any suggestion 
is welcome. Thank you.

Here is the code

Regards
David
