Cross-posting to the compiler group-

On Wed, Mar 30, 2016 at 8:10 AM, Elizabeth Mattijsen <l...@dijkmat.nl> wrote:
> If you know the line endings of the file, using 
> IO::Handle.split($line-ending) (note the actual character, rather than a 
> regular expression) might help.  That will read in the file in chunks of 64K 
> and then lazily serve lines from that chunk.

This reminds me of a pet peeve I had with p5: the inability to easily
change the default buffer size for reading & writing.

I'm the lone Perl expert at $work and at one point was trying to keep
a file processing step in perl. The files were about 100x the size
of the server's RAM and consisted of variable-length, newline-terminated
text; the processing was very light, and a few jobs would run in
parallel. The candidate language, C#, has a text-file-reading object
that lets you set its read-ahead buffer size when you create it/open
the file (can't remember the details). That size had a large impact on
the performance of this task. With perl... I could not use the
not-so-well-documented IO::Handle->setvbuf because my OS didn't
support it. I did hack together something with sysread, but C# won in
the end, due partly to that.
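
For illustration, a sysread-style workaround amounts to roughly the
following. This is a minimal sketch in Python, not the original Perl
code: read_lines and bufsize are hypothetical names, and os.read stands
in for sysread as an unbuffered read whose chunk size the caller picks.

```python
import os


def read_lines(path, bufsize=64 * 1024):
    """Yield lines (without the trailing newline) from path.

    Each os.read call requests at most bufsize bytes, so the caller
    controls the read-ahead size directly (hypothetical helper).
    """
    # O_BINARY only exists on Windows; fall back to 0 elsewhere.
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        pending = b""  # bytes read so far that do not yet end in a newline
        while True:
            chunk = os.read(fd, bufsize)  # unbuffered read, like sysread
            if not chunk:
                break  # end of file
            pending += chunk
            lines = pending.split(b"\n")
            pending = lines.pop()  # incomplete tail (or b"" after a newline)
            for line in lines:
                yield line
        if pending:
            yield pending  # last line had no terminating newline
    finally:
        os.close(fd)
```

Calling read_lines(path, bufsize=1024 * 1024) versus the default only
changes how many bytes each read requests, which is the knob I wanted.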

It seems this sub-optimal "hiding-of-buffer" situation is being
repeated in Perl6: neither https://doc.perl6.org/routine/open nor
http://doc.perl6.org/type/IO::Handle mentions a buffer, yet IO::Handle
reads ahead and buffers. Experience shows that being able to adjust
this buffer can help in certain situations. Also, perl5 has defaulted
to 4k and 8k, whereas perl6 is apparently using 64k, which is evidence
that the right buffer size changes as system builds evolve.

Please make this easily readable & settable, anywhere it's implemented!


-y
