Re: A practical benchmark shows speed challenges for Perl 6

2016-03-30 Thread Tom Browder
On Tue, Mar 29, 2016 at 10:29 PM, Timo Paulssen  wrote:
> On 03/30/2016 03:45 AM, Timo Paulssen wrote:
>
> Could you try using $filename.IO.slurp.lines instead of $filename.IO.lines
> and see if that makes things any faster?
>   - Timo
>
>
> Actually, the method on an IO::Handle is called "slurp-rest"; slurp would
> only work with a filename instead.
>   - Timo

Okay, I've done a comparison of the three methods on a 1 GB file:

IO.lines
  real 2m11.827s
  user 2m10.036s
  sys 0m1.468s

IO.split
  real 1m51.504s
  user 1m51.136s
  sys 0m0.352s

IO.slurp-rest
  real 2m9.821s
  user 2m6.268s
  sys 0m3.532s

and Perl 5:

  real 0m4.614s
  user 0m4.328s
  sys 0m0.280s

Best,

-Tom


Re: A practical benchmark shows speed challenges for Perl 6

2016-03-30 Thread James E Keenan

On 03/30/2016 04:11 PM, yary wrote:
> On Wed, Mar 30, 2016 at 3:20 PM, Elizabeth Mattijsen  wrote:
>> Thanks for your thoughts!
>>
>> I’ve implemented $*DEFAULT-READ-ELEMS in
>> https://github.com/rakudo/rakudo/commit/5bd1e .
>>
>> Of course, all of this is provisional, and open for debate and bikeshedding.




Yary, if you feel there's a need for this functionality in Perl *5* as 
well, please file a bug ticket via perlbug.


Thank you very much.
Jim Keenan


Re: A practical benchmark shows speed challenges for Perl 6

2016-03-30 Thread yary
On Wed, Mar 30, 2016 at 3:20 PM, Elizabeth Mattijsen  wrote:
> Thanks for your thoughts!
>
> I’ve implemented $*DEFAULT-READ-ELEMS in 
> https://github.com/rakudo/rakudo/commit/5bd1e .
>
> Of course, all of this is provisional, and open for debate and bikeshedding.


Thanks! And that was fast!

Allowing DEFAULT-READ-ELEMS to be set from the environment is a good
idea that I hadn't thought of. Since it is a machine-dependent
performance tweak, it makes sense to let it be set outside the code.

I had originally envisioned this as an "option" to "sub open" for
fine-grained control as to which IO::Handles got what
DEFAULT-READ-ELEMS, but I'm not sure it belongs there. After all, it is
a performance-related tweak and I'm liking the idea of it being
primarily set from the environment. Setting it in the code means
you're writing something for a particular host, and we don't need to
change the spec to support that.
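
As a rough illustration of the "set it from the environment" idea, here is a minimal Python sketch, not Rakudo's implementation. The variable name READ_CHUNK_BYTES and the helper read_size are hypothetical and exist only for this example; the 64K fallback is the chunk size mentioned elsewhere in this thread.

```python
import os

# 64K fallback, the chunk size mentioned in this thread.
DEFAULT_READ_BYTES = 64 * 1024


def read_size(environ=os.environ, name="READ_CHUNK_BYTES"):
    """Return the read size from the environment, else the default.

    READ_CHUNK_BYTES is a hypothetical name used only for this sketch.
    """
    raw = environ.get(name)
    if raw is None:
        return DEFAULT_READ_BYTES
    try:
        value = int(raw)
    except ValueError:
        # Ignore values that are not integers.
        return DEFAULT_READ_BYTES
    # Ignore zero and negative sizes as well.
    return value if value > 0 else DEFAULT_READ_BYTES
```

The code itself stays host-independent this way: the same script can run with a different read size on each machine without being edited.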

Is there anything similar on the "write" side (output buffering) that
could use this treatment?

-y


Re: A practical benchmark shows speed challenges for Perl 6

2016-03-30 Thread Elizabeth Mattijsen
> On 30 Mar 2016, at 16:06, yary  wrote:
> 
> Cross-posting to the compiler group-
> 
> On Wed, Mar 30, 2016 at 8:10 AM, Elizabeth Mattijsen  wrote:
>> If you know the line endings of the file, using 
>> IO::Handle.split($line-ending) (note the actual character, rather than a 
>> regular expression) might help.  That will read in the file in chunks of 64K 
>> and then lazily serve lines from that chunk.
> 
> This reminds me of a pet peeve I had with p5: Inability to easily
> change the default buffer size for reading & writing.
> 
> I'm the lone Perl expert at $work and at one point was trying to keep
> a file processing step in perl. These files were about 100x the size
> of the server's RAM, consisted of variable-length newline-terminated
> text, the processing was very light, there would be a few running in
> parallel. The candidate language, C#, has a text-file-reading object
> that lets you set its read-ahead buffer on creation/opening the file-
> can't remember the details. That size had a large impact on the
> performance of this task. With perl... I could not use the
> not-so-well-documented IO::Handle->setvbuf because my OS didn't
> support it. I did hack together something with sysread, but C# won in
> the end due partly to that.
> 
> It seems this "hiding-of-buffer" sub-optimal situation is being
> repeated in Perl6: neither https://doc.perl6.org/routine/open nor
> http://doc.perl6.org/type/IO::Handle mention a buffer, yet IO::Handle
> reads ahead and buffers. Experience shows that being able to adjust
> this buffer can help in certain situations. Also consider that perl5
> has defaulted to 4k and 8k, whereas perl6 is apparently using 64k, as
> evidence that this buffer needs to change as system builds evolve.
> 
> Please make this easily readable & settable, anywhere it's implemented!

Thanks for your thoughts!

I’ve implemented $*DEFAULT-READ-ELEMS in 
https://github.com/rakudo/rakudo/commit/5bd1e .

Of course, all of this is provisional, and open for debate and bikeshedding.


Liz

Re: A practical benchmark shows speed challenges for Perl 6

2016-03-30 Thread Timo Paulssen

On 30/03/16 13:40, Tom Browder wrote:
> On Tue, Mar 29, 2016 at 10:29 PM, Timo Paulssen  wrote:
>> On 03/30/2016 03:45 AM, Timo Paulssen wrote:
>>
>> Could you try using $filename.IO.slurp.lines instead of $filename.IO.lines
>> and see if that makes things any faster?
> ...
>> Actually, the method on an IO::Handle is called "slurp-rest"; slurp would
>> only work with a filename instead.
>>   - Timo
>
> Timo, I'm trying to test a situation where I could process every line
> as it is read in.  The situation assumes the file is too large to
> slurp into memory, thus the read of one line at a time.  So is there
> another way to do that?  According to the docs "slurp-rest" gets all
> the remaining file at one read.
>
> Thanks,
>
> Best regards,
>
> -Tom


I was suggesting this mostly because we've recently discovered a very 
severe performance problem with IO.lines. I'd like to know if that also 
affects your benchmark and how big the saving might be for "moderately" 
sized data.


timo@schmand ~/p/e/SDL2_raw-p6 (master)> time perl6 -e 'for "heap-snapshot".IO.lines {}'
129.14user 0.87system 2:10.44elapsed 99%CPU (0avgtext+0avgdata 507580maxresident)k


timo@schmand ~/p/e/SDL2_raw-p6 (master)> time perl6 -e 'for "heap-snapshot".IO.slurp.lines {}'
1.92user 0.14system 0:02.07elapsed 99%CPU (0avgtext+0avgdata 537940maxresident)k


timo@schmand ~/p/e/SDL2_raw-p6 (master)> time perl6 -e 'for "heap-snapshot".IO.open.split("\n") {}'
192.04user 0.36system 3:12.70elapsed 99%CPU (0avgtext+0avgdata 1350204maxresident)k


Hope this clears up how I meant that :)
  - Timo


Re: A practical benchmark shows speed challenges for Perl 6

2016-03-30 Thread yary
Cross-posting to the compiler group-

On Wed, Mar 30, 2016 at 8:10 AM, Elizabeth Mattijsen  wrote:
> If you know the line endings of the file, using 
> IO::Handle.split($line-ending) (note the actual character, rather than a 
> regular expression) might help.  That will read in the file in chunks of 64K 
> and then lazily serve lines from that chunk.

This reminds me of a pet peeve I had with p5: Inability to easily
change the default buffer size for reading & writing.

I'm the lone Perl expert at $work and at one point was trying to keep
a file processing step in perl. These files were about 100x the size
of the server's RAM, consisted of variable-length newline-terminated
text, the processing was very light, there would be a few running in
parallel. The candidate language, C#, has a text-file-reading object
that lets you set its read-ahead buffer on creation/opening the file-
can't remember the details. That size had a large impact on the
performance of this task. With perl... I could not use the
not-so-well-documented IO::Handle->setvbuf because my OS didn't
support it. I did hack together something with sysread, but C# won in
the end due partly to that.

It seems this "hiding-of-buffer" sub-optimal situation is being
repeated in Perl6: neither https://doc.perl6.org/routine/open nor
http://doc.perl6.org/type/IO::Handle mention a buffer, yet IO::Handle
reads ahead and buffers. Experience shows that being able to adjust
this buffer can help in certain situations. Also consider that perl5
has defaulted to 4k and 8k, whereas perl6 is apparently using 64k, as
evidence that this buffer needs to change as system builds evolve.

Please make this easily readable & settable, anywhere it's implemented!
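
For comparison, this is roughly what a "settable at open time" buffer looks like in a language that exposes it. The sketch below uses Python's built-in open(), whose buffering argument sets the size of the underlying buffered reader; count_lines and demo are hypothetical helper names, and the sizes tried are the 4k, 8k and 64k figures mentioned above. It illustrates the request and says nothing about how Perl 5 or Perl 6 do it.

```python
import os
import tempfile


def count_lines(path, buffer_bytes):
    """Count lines using a caller-chosen read-ahead buffer size."""
    # buffering=N with N > 1 asks for a buffered reader of about N bytes.
    with open(path, "rb", buffering=buffer_bytes) as handle:
        return sum(1 for _ in handle)


def demo():
    """Read the same small file with three different buffer sizes."""
    with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
        tmp.write(b"line\n" * 1000)
        path = tmp.name
    try:
        return [count_lines(path, size) for size in (4096, 8192, 65536)]
    finally:
        os.remove(path)
```

The result is the same for every buffer size; only the number and size of the underlying reads change, which is what makes this a pure performance knob.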


-y


Re: A practical benchmark shows speed challenges for Perl 6

2016-03-30 Thread Elizabeth Mattijsen
> On 30 Mar 2016, at 13:40, Tom Browder  wrote:
> On Tue, Mar 29, 2016 at 10:29 PM, Timo Paulssen  wrote:
>> On 03/30/2016 03:45 AM, Timo Paulssen wrote:
>> 
>> Could you try using $filename.IO.slurp.lines instead of $filename.IO.lines
>> and see if that makes things any faster?
> ...
>> Actually, the method on an IO::Handle is called "slurp-rest"; slurp would
>> only work with a filename instead.
>>  - Timo
> Timo, I'm trying to test a situation where I could process every line
> as it is read in.  The situation assumes the file is too large to
> slurp into memory, thus the read of one line at a time.  So is there
> another way to do that?  According to the docs "slurp-rest" gets all
> the remaining file at one read.

That is correct.

The thing is that IO.lines basically depends on IO.get to get a line, which
is extra overhead that IO.slurp.lines doesn’t have.

If you know the line endings of the file, using IO::Handle.split($line-ending) 
(note the actual character, rather than a regular expression) might help.  That 
will read in the file in chunks of 64K and then lazily serve lines from that 
chunk.
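
To make the "read a chunk, then lazily serve lines from it" idea concrete, here is a small Python sketch of the general technique. It is an illustration only and is not taken from the Rakudo source; split_chunked is a hypothetical name, and the default chunk size simply mirrors the 64K figure above.

```python
import io


def split_chunked(handle, separator, chunk_size=64 * 1024):
    """Lazily yield pieces of a text stream split on a literal separator."""
    pending = ""
    while True:
        chunk = handle.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        pieces = pending.split(separator)
        # The last piece may be incomplete, so keep it for the next chunk.
        pending = pieces.pop()
        yield from pieces
    # Whatever follows the final separator (possibly an empty string).
    yield pending


# Usage: a tiny chunk size forces several reads on a short input.
lines = list(split_chunked(io.StringIO("a\nb\n"), "\n", chunk_size=3))
```

Only one chunk plus one partial piece is held in memory at a time, so the approach also works when the file is too large to slurp.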

A simple test on an /etc/dict/words:

$ 6 '"words".IO.lines.elems.say'
235886
real 0m0.645s

$ 6 '"words".IO.open.split("\x0a").elems.say'
235887
real 0m0.317s

Note that with .split you will get an extra empty line at the end.
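
The off-by-one between the two counts above is what you would expect from splitting on the separator when the file ends with a newline. The same effect can be shown in Python (used here only as an illustration; count_by_split and count_lines are hypothetical helper names):

```python
def count_by_split(text, line_ending="\n"):
    """Count pieces produced by splitting on a literal line ending."""
    return len(text.split(line_ending))


def count_lines(text):
    """Count lines the way a line-oriented reader would."""
    return len(text.splitlines())


sample = "alpha\nbeta\ngamma\n"
split_count = count_by_split(sample)  # 4: the piece after the final "\n" is ""
line_count = count_lines(sample)      # 3
```

If the trailing empty piece matters, it can be dropped after splitting, or the final line ending can be stripped before splitting.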


Hope this helps.


Liz

Re: A practical benchmark shows speed challenges for Perl 6

2016-03-30 Thread Tom Browder
On Tue, Mar 29, 2016 at 10:29 PM, Timo Paulssen  wrote:
> On 03/30/2016 03:45 AM, Timo Paulssen wrote:
>
> Could you try using $filename.IO.slurp.lines instead of $filename.IO.lines
> and see if that makes things any faster?
...
> Actually, the method on an IO::Handle is called "slurp-rest"; slurp would
> only work with a filename instead.
>   - Timo


Timo, I'm trying to test a situation where I could process every line
as it is read in.  The situation assumes the file is too large to
slurp into memory, thus the read of one line at a time.  So is there
another way to do that?  According to the docs "slurp-rest" gets all
the remaining file at one read.

Thanks,

Best regards,

-Tom