I use Perl for heavy-duty text processing. A question on Perl Monks
about Perl 5's handling of a large input file got me wondering how the
two Perls compare at the moment.

I wrote a couple of simple programs, in both languages, to write and
read a 10 GB text file filled with identical 100-character lines. The
reading programs counted total lines and characters of the input file.
The results on my fastest host show that much optimization is still
needed for Perl 6.
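
The test programs themselves are not included in this message. As a rough
illustration only, a Perl 5 writer and reader of the kind described might
look like the sketch below. The sub names (write_test_file,
count_lines_and_chars) and the 'x' filler are hypothetical, and the sketch
assumes each line is 99 ASCII characters plus a newline, which matches the
100-characters-per-line totals in the output shown further down.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical writer: fill a file with identical 100-character lines
# (99 filler characters plus a newline; assumes Unix line endings).
sub write_test_file {
    my ($path, $nlines) = @_;
    my $line = ('x' x 99) . "\n";
    open my $fh, '>', $path or die "Cannot write '$path': $!";
    print {$fh} $line for 1 .. $nlines;
    close $fh or die "Cannot close '$path': $!";
    return;
}

# Hypothetical reader: count lines and characters, one line at a time.
# With no I/O layer set, length() counts bytes, which equals characters
# for the ASCII test data assumed here.
sub count_lines_and_chars {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    my ($nlines, $nchars) = (0, 0);
    while (my $line = <$fh>) {
        ++$nlines;
        $nchars += length $line;    # includes the trailing newline
    }
    close $fh;
    return ($nlines, $nchars);
}

# Report on a file only when one is named on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    printf "  File '%s' size: %d bytes\n", $file, (-s $file);
    my ($nlines, $nchars) = count_lines_and_chars($file);
    print "  Normal end.\n";
    print "  For input file '$file':\n";
    print "    Number lines: $nlines\n";
    print "    Number chars: $nchars\n";
}
```

Reading line by line keeps memory use flat regardless of file size, so the
timings reflect per-line I/O and string handling rather than slurping.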

I compared read times for file sizes from 1 to 10 GB in 1 GB
increments and, in general, Perl 6 takes roughly 30 times longer than
Perl 5.14 to read the same file.  So far I see no significant
improvement in Rakudo 2016.01 over 2015.12, but the tests haven't
quite finished yet.

When I use the stats incantation shown by Liz, I get:

$ time perl6 --stagestats read-file-test.p6 large-1-gb-file.txt
Stage start      :   0.000
Stage parse      :   0.160
Stage syntaxcheck:   0.000
Stage ast        :   0.000
Stage optimize   :   0.005
Stage mast       :   0.021
Stage mbc        :   0.000
Stage moar       :   0.000
  File 'large-1-gb-file.txt' size: 1000000000 bytes
  Normal end.
  For input file 'large-1-gb-file.txt':
    Number lines: 10000000
    Number chars: 1000000000

real 2m8.585s
user 2m5.408s
sys 0m0.968s

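It looks to me as though no single stage is a hotspot: the compile
stages together take under 0.2 seconds, so nearly all of the two
minutes is spent running the program, and what is needed is general
run-time optimization.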

Without the stats I get for Perl 5 (5.14):
--------------------------------------------------------

$ time perl read-file-test.pl large-1-gb-file.txt
  File 'large-1-gb-file.txt' size: 1000000000 bytes
  Normal end.
  For input file 'large-1-gb-file.txt':
    Number lines: 10000000
    Number chars: 1000000000

real 0m6.216s
user 0m4.784s
sys 0m0.328s

And for Perl 6 (2016.01.1) I get:
---------------------------------------------

$ time perl6 read-file-test.p6 large-1-gb-file.txt
  File 'large-1-gb-file.txt' size: 1000000000 bytes
  Normal end.
  For input file 'large-1-gb-file.txt':
    Number lines: 10000000
    Number chars: 1000000000

real 2m6.687s
user 2m4.216s
sys 0m0.588s

I tried the suggestion from Bart Wiegmans to compile the program:

$ perl6 --target=mbc --output=read-file-test.moarvm read-file-test.p6
$ time perl6 read-file-test.moarvm large-1-gb-file.txt
Error while reading from file: Malformed UTF-8

So I guess precompilation is not yet ready for public testing.  That
will be a nice feature, IMHO!

Cheers!

-Tom
