On Fri, 29 Oct 2021 at 00:46, yary <[email protected]> wrote:
> A small thing to begin with in the regex
> m/ ^ (@attributes) ':' \s (.+) $ /;
> m/ ^ (@attributes) ': ' (.*) $ /;
>
Yes, nice cleanup. Thanks.
> Next, how about adding a 2nd regex test similar to the "split" that also
> relies on User ignoring unknown fields? This accepts an empty-string key,
> which the "split" string handler does too.
>
> m/ ^ (<-[:]>*) ': ' (.*) /;
>
$ ./icheck.raku regex2
41391 entries by regex2 in 4.615332639 seconds
Whoa! That was surprising. The new regex is only about 2x slower than the
"split" method.
I did read on SO that someone claimed "longest-match alternation of the
list's elements" is slow.
But I thought the conclusion in the answers was that, in general, regexes
are slow.
Might have to test this example again on 2021.10 (not easy for me).
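
To make the difference between the two patterns concrete, here is a minimal
sketch. The sample lines are made up for illustration and are not taken from
the real LDIF file. It only shows what each pattern accepts, not how fast it
is.

```raku
my @attributes = <uid uidNumber gidNumber homeDirectory mode>;

# Hypothetical input lines: a known key, an unknown key, and an empty key.
for 'uid: someuser', 'cn: Some Name', ': empty key' -> $line {
    # An array inside a regex is an alternation of its elements.
    my $by-list  = so $line ~~ m/ ^ (@attributes) ': ' (.*) $ /;
    # The negated character class accepts any run of non-colon characters,
    # including an empty one.
    my $by-class = so $line ~~ m/ ^ (<-[:]>*) ': ' (.*) $ /;
    say $line.raku, ' list=', $by-list, ' class=', $by-class;
}
```

Only the first line matches the @attributes version, while all three match
the character-class version, so "regex2" has to rely on User ignoring the
extra keys.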
>>> Results for rakudo-pkg-2021.9.0-01:
>>> $ ./icheck.raku regex
>>> 41391 entries by regex in 27.859560887 seconds
>>> $ ./icheck.raku starts
>>> 41391 entries by starts-with in 5.970667533 seconds
>>> $ ./icheck.raku split
>>> 41391 entries by split in 5.12252741 seconds
>>>
>>> Results for rakudo-pkg-2021.10.0-01
>>> $ ./icheck.raku regex
>>> 41391 entries by regex in 27.833870158 seconds
>>> $ ./icheck.raku starts
>>> 41391 entries by starts-with in 2.560101599 seconds
>>> $ ./icheck.raku split
>>> 41391 entries by split in 2.307679407 seconds
>>>
>>>
--------------------------------------------------
#!/usr/bin/env raku
class User {
    has $.uid;
    has $.uidNumber;
    has $.gidNumber;
    has $.homeDirectory;
    has $.mode = 0;
    method attributes {
        # return <uid uidNumber gidNumber homeDirectory mode>;
        User.^attributes(:local)>>.name>>.substr(2); # Is the order guaranteed?
    }
}
# Read user info from LDIF file
my %ldap;
my @attributes = User.attributes;
multi MAIN ( "regex", $ldif-fn = "db/icheck.ldif" ) {
    my ( %f );
    for $ldif-fn.IO.lines -> $line {
        when not $line { # blank line is LDIF entry terminator
            %ldap{%f<uid>} = User.new( |%f );
        }
        when $line.starts-with( 'dn: ' ) { %f = () } # dn: starts a new entry
        next unless $line ~~ m/ ^ (@attributes) ': ' (.*) $ /;
        %f{$0} = "$1";
    }
    say "{%ldap.elems} entries by regex in {now - BEGIN now} seconds";
}
multi MAIN ( "regex2", $ldif-fn = "db/icheck.ldif" ) {
    my ( %f );
    for $ldif-fn.IO.lines -> $line {
        when not $line { # blank line is LDIF entry terminator
            %ldap{%f<uid>} = User.new( |%f );
        }
        when $line.starts-with( 'dn: ' ) { %f = () } # dn: starts a new entry
        next unless $line ~~ m/ ^ (<-[:]>*) ': ' (.*) $ /;
        %f{$0} = "$1";
    }
    say "{%ldap.elems} entries by regex2 in {now - BEGIN now} seconds";
}
multi MAIN ( "starts", $ldif-fn = "db/icheck.ldif" ) {
    my ( %f );
    for $ldif-fn.IO.lines -> $line {
        when not $line { # blank line is LDIF entry terminator
            %ldap{%f<uid>} = User.new( |%f );
        }
        when $line.starts-with( 'dn: ' ) { %f = () } # dn: starts a new entry
        for @attributes -> $a {
            if $line.starts-with( $a ~ ": " ) {
                %f{$a} = (split( ": ", $line, 2))[1];
                last;
            }
        }
    }
    say "{%ldap.elems} entries by starts-with in {now - BEGIN now} seconds";
}
multi MAIN ( "split", $ldif-fn = "db/icheck.ldif" ) {
    my ( %f, $k, $v );
    for $ldif-fn.IO.lines -> $line {
        when not $line { # blank line is LDIF entry terminator
            %ldap{%f<uid>} = User.new( |%f ); # attributes not used are ignored
        }
        when $line.starts-with( 'dn: ' ) { %f = () } # dn: starts a new entry
        ($k, $v) = split( ": ", $line, 2);
        %f{$k} = $v;
    }
    say "{%ldap.elems} entries by split in {now - BEGIN now} seconds";
}
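
The "split" and "regex2" handlers both depend on User.new silently ignoring
named arguments that do not correspond to an attribute. A minimal sketch of
that behaviour follows; the entry values, including the extra "cn" key, are
hypothetical.

```raku
class User {
    has $.uid;
    has $.uidNumber;
    has $.gidNumber;
    has $.homeDirectory;
    has $.mode = 0;
}

# Hypothetical parsed entry; "cn" is not an attribute of User.
my %f = uid => 'someuser', uidNumber => '1001', cn => 'Some Name';

# The default constructor drops named arguments it does not recognise,
# and attributes that were not passed keep their defaults.
my $u = User.new( |%f );
say $u.uid;    # someuser
say $u.mode;   # 0
```

If that silent behaviour is not wanted, the hash could be filtered against
User.attributes before calling new, at the cost of an extra pass per entry.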
--
Norman Gaywood, Computer Systems Officer
School of Science and Technology
University of New England
Armidale NSW 2351, Australia
[email protected] http://turing.une.edu.au/~ngaywood
Phone: +61 (0)2 6773 2412 Mobile: +61 (0)4 7862 0062
Please avoid sending me Word or Power Point attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html