Re: [perl #127064] Variable interpolation in regex very slow

2015-12-31 Thread Jules Field



On 29/12/2015 23:05, Timo Paulssen via RT wrote:

> On 12/29/2015 12:46 AM, Jules Field (via RT) wrote:
>> # New Ticket Created by  Jules Field
>> # Please include the string:  [perl #127064]
>> # in the subject line of all future correspondence about this issue.
>> # https://rt.perl.org/Ticket/Display.html?id=127064
>>
>> Given
>>    my @lines = "some-text.txt".IO.lines;
>>    my $s = 'Jules';
>> (some-text.txt is about 43k lines)
>>
>> Doing
>>    my @matching = @lines.grep(/ $s /);
>> is about 50 times slower than
>>    my @matching = @lines.grep(/ Jules /);
>>
>> And if $s happened to contain anything other than literals, so I had to use
>>    my @matching = @lines.grep(/ <$s> /);
>> then it's nearly 150 times slower.
>>
>>    my @matching = @lines.grep($s);
>> doesn't appear to work. It matches 0 lines but doesn't die.
>>
>> The lack of Perl5's straightforward variable interpolation in regexes is
>> crippling the speed.
>> Is there a faster alternative? (other than EVAL to build the regex)
>
> For now, you can use @lines.grep(*.contains($s)), which will be
> sufficiently fast.
>
> Ideally, our regex optimizer would turn this simple regex into code
> that uses .index to find a literal string and construct a match object
> for that. Or even - if you put a literal "so" in front - turn it into
> .contains($literal) if it knows that the match object will only be
> inspected for true/false.
>
> Until then, we ought to be able to make interpolation a bit faster.
>    - Timo

Many thanks for that. I hadn't thought to use Whatever.

I would ideally also be doing case-insensitive regexps, but they are 50 
times slower than case-sensitive ones, even in trivial cases.
Maybe a :adverb for rx// that says "give me static (i.e. Perl5-style) 
interpolation in this regex"?
I can see the advantage of passing the variables to the regex engine, as 
then they can change over time.


But that's not something I want to do very often. Far more frequently I
just need to construct the regex at run-time and have it go as fast as
possible.
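
For the case-insensitive case with a purely literal search string, one
possible workaround (a sketch, not something suggested in the thread) is
to lowercase both sides and use .contains. The sample lines are made up
for illustration, and .lc is only an approximation of full Unicode case
folding:

```raku
# Hypothetical sample data standing in for some-text.txt
my @lines = 'Hello JULES', 'nothing here', 'jules again';
my $s = 'Jules';

# Lowercase the needle once, outside the grep
my $needle = $s.lc;

# Lowercase each line and do a literal substring test; no regex involved
my @matching = @lines.grep(*.lc.contains($needle));

say @matching.elems;
```

This only covers literal strings; a pattern with real regex syntax in it
would still need the regex engine.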


Just thoughts from a big Perl5 user (e.g. MailScanner is 50k lines of it!).

Jules

--
ju...@jules.uk
Twitter: @JulesFM

'If I were a Brazilian without land or money or the means to feed
 my children, I would be burning the rain forest too.' - Sting





Re: [perl #127064] Variable interpolation in regex very slow

2015-12-29 Thread Timo Paulssen
On 12/29/2015 12:46 AM, Jules Field (via RT) wrote:
> # New Ticket Created by  Jules Field 
> # Please include the string:  [perl #127064]
> # in the subject line of all future correspondence about this issue. 
> # https://rt.perl.org/Ticket/Display.html?id=127064
>
>
> Given
>    my @lines = "some-text.txt".IO.lines;
>    my $s = 'Jules';
> (some-text.txt is about 43k lines)
>
> Doing
>    my @matching = @lines.grep(/ $s /);
> is about 50 times slower than
>    my @matching = @lines.grep(/ Jules /);
>
> And if $s happened to contain anything other than literals, so I had to use
>    my @matching = @lines.grep(/ <$s> /);
> then it's nearly 150 times slower.
>
>    my @matching = @lines.grep($s);
> doesn't appear to work. It matches 0 lines but doesn't die.
>
> The lack of Perl5's straightforward variable interpolation in regexes is
> crippling the speed.
> Is there a faster alternative? (other than EVAL to build the regex)
>

For now, you can use @lines.grep(*.contains($s)), which will be
sufficiently fast.

Ideally, our regex optimizer would turn this simple regex into code
that uses .index to find a literal string and construct a match object
for that. Or even - if you put a literal "so" in front - turn it into
.contains($literal) if it knows that the match object will only be
inspected for true/false.
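
A minimal sketch of the two literal-string approaches mentioned above,
assuming a small in-memory list in place of some-text.txt (the sample
lines are made up for illustration):

```raku
# Hypothetical sample data standing in for some-text.txt
my @lines = 'Hello Jules', 'nothing here', 'Jules again';
my $s = 'Jules';

# .contains is a plain substring test; the regex engine is not involved
my @matching = @lines.grep(*.contains($s));
say @matching.elems;

# .index returns the offset of the first occurrence, or Nil if absent.
# "with" tests definedness, so a match at offset 0 is still handled.
for @lines -> $line {
    with $line.index($s) -> $pos {
        say "$line (offset $pos)";
    }
}
```

Neither call builds a Match object, which is the part an optimizer would
have to add back if the caller needed one.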

Until then, we ought to be able to make interpolation a bit faster.
  - Timo


[perl #127064] Variable interpolation in regex very slow

2015-12-29 Thread via RT
# New Ticket Created by  Jules Field 
# Please include the string:  [perl #127064]
# in the subject line of all future correspondence about this issue. 
# https://rt.perl.org/Ticket/Display.html?id=127064


Given
   my @lines = "some-text.txt".IO.lines;
   my $s = 'Jules';
(some-text.txt is about 43k lines)

Doing
   my @matching = @lines.grep(/ $s /);
is about 50 times slower than
   my @matching = @lines.grep(/ Jules /);

And if $s happened to contain anything other than literals, so I had to use
   my @matching = @lines.grep(/ <$s> /);
then it's nearly 150 times slower.
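
One way to reproduce the comparison is a small timing harness. This is a
sketch under assumptions: the "timed" helper and the generated sample
lines are hypothetical, and the ratios it prints will vary with the
Rakudo version, so no particular numbers are implied.

```raku
# Hypothetical generated data standing in for some-text.txt
my @lines = (^2000).map({
    $_ %% 2 ?? "line $_ mentions Jules" !! "line $_ is filler"
});
my $s = 'Jules';

# Hypothetical helper: run a block, force its result, and return the
# number of elements together with the elapsed time.
sub timed(&code) {
    my $start = now;
    my $count = code().elems;   # .elems forces the lazy grep to run
    return $count, now - $start;
}

my ($n-lit, $t-lit) = timed { @lines.grep(/ Jules /) };
my ($n-var, $t-var) = timed { @lines.grep(/ $s /) };
my ($n-rx,  $t-rx)  = timed { @lines.grep(/ <$s> /) };

say "literal:       $n-lit matches in {$t-lit}s";
say "interpolated:  $n-var matches in {$t-var}s";
say "as subregex:   $n-rx matches in {$t-rx}s";
```

All three forms should report the same number of matches; only the
timings differ.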

   my @matching = @lines.grep($s);
doesn't appear to work. It matches 0 lines but doesn't die.
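
A note on why that last form matches nothing: grep smartmatches each
element against its argument, and smartmatching a string against a
string is an equality test, not a substring search. A short sketch with
made-up sample lines:

```raku
# Hypothetical sample data
my @lines = 'Hello Jules', 'Jules', 'nothing here';
my $s = 'Jules';

# Str ~~ Str compares with eq, so only a line that is exactly
# 'Jules' is kept.
my @exact = @lines.grep($s);

# Substring semantics need an explicit test.
my @substr = @lines.grep(*.contains($s));

say @exact.elems;
say @substr.elems;
```

In a file where no line consists solely of the search string, the first
form returns an empty result without any error.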

The lack of Perl5's straightforward variable interpolation in regexes is
crippling the speed.
Is there a faster alternative? (other than EVAL to build the regex)

-- 
ju...@jules.uk