On Fri, 15 Oct 2010 02:01:55 +0400, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:

On 10/14/10 14:00 CDT, Denis Koroskin wrote:
On Thu, 14 Oct 2010 22:22:07 +0400, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:

On 10/14/10 12:56 CDT, Denis Koroskin wrote:
appendDelim *requires* buffering to be implemented. No OS provides an API
to read from a file (be it a pipe, socket, whatever) up to some abstract
delimiter. It *always* reads in blocks.

Clear. What may be not so clear is that read(ubyte[] buf) ALSO
requires buffering. Disk I/O comes in fixed buffer sizes (sometimes
aligned at 512 bytes or whatever), so ANY protocol that allows the
user to set the maximum bytes to read will require buffering and
copying. So how is appendDelim worse than read?

As such, if you
need to read until a delimiter, you need to fetch a block into some internal
buffer, MANUALLY search through it, and THEN copy to the output string.

And there's no way for the client to efficiently do that.

I've
implemented that on top of a chunked read interface, and it was 5% faster than the getline()/getdelim() that GNU libc provides (despite you claiming it
to be "many times faster"). It's not.

Please post your code.


http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=119248

I meant the baseline.
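
A minimal sketch of the kind of getline() baseline presumably being compared
against (illustrative only, not the code being asked for; the file name and
the line-counting loop are assumptions, and getline is declared directly
rather than imported from a specific druntime module):

import core.stdc.stdio : FILE, fopen, fclose, printf;
import core.stdc.stdlib : free;

// POSIX getline(); ptrdiff_t stands in for ssize_t.
extern(C) ptrdiff_t getline(char** lineptr, size_t* n, FILE* stream);

void main()
{
    auto f = fopen("input.txt", "r");      // hypothetical input file
    if (f is null) return;
    scope(exit) fclose(f);

    char* line = null;
    size_t cap = 0;
    size_t count = 0;
    while (getline(&line, &cap, f) != -1)  // libc grows `line` as needed
        ++count;
    free(line);

    printf("%zu lines\n", count);
}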

Buffering requires an additional level of data copying, and this is bad
for fast I/O.

Agreed. But then you define routines that also require buffering. How
do you reconcile your own requirement with your own interface?


My interface doesn't require any additional copying. You only copy when
you need to buffer something, but in general you don't. My Stream
interface is simply a thin portable layer on top of the OS. See the code
above for a simple implementation built on top of fopen/fread (I used
open/read initially, but it gave zero improvement, so I went back to
fopen/fread; GNU libc's line input uses them too, so that makes for the
fairest comparison). It can't be any more efficient than that.

If you need fast I/O, you must pull that out of the stream
interface. Otherwise chunked reads will be less efficient due to
additional copies to and from buffers.

On the contrary, line-based reading can be implemented on top of the
chunked read without sacrificing even a tiny bit of efficiency.

Except for extra copying.

appendDelim implementation:

1. Low-level read into internal buffers

2. Search for delimiter (assume found for simplicity)

3. Resize user buffer

4. Copy

That's one copy, with the necessary corner cases when the delimiter
isn't found yet etc. (which increase copying ONLY if the buffer is
actually moved when reallocated).
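
A minimal D sketch of those four steps, using hypothetical names
(ChunkReader, appendDelim) and built on fopen/fread as in the code discussed
above -- not the actual implementation from either message:

import core.stdc.stdio : FILE, fopen, fclose, fread, printf;
import core.stdc.string : memchr;

// Hypothetical one-copy appendDelim: fread into an internal block buffer,
// search it for the delimiter, then grow the caller's array and copy out
// only the consumed bytes.
struct ChunkReader
{
    FILE* f;
    ubyte[] buf;      // internal block buffer
    size_t pos, len;  // window of not-yet-consumed bytes in buf

    // Appends everything up to and including `delim` to `dst`.
    // Returns false only when nothing more could be read (EOF or error).
    bool appendDelim(ref ubyte[] dst, ubyte delim)
    {
        bool gotAny = false;
        for (;;)
        {
            if (pos == len)                                   // step 1: refill
            {
                len = fread(buf.ptr, 1, buf.length, f);
                pos = 0;
                if (len == 0) return gotAny;
            }
            auto window = buf[pos .. len];
            auto hit = cast(const(ubyte)*)
                memchr(window.ptr, delim, window.length);     // step 2: search
            size_t take = hit is null
                ? window.length
                : cast(size_t)(hit - window.ptr) + 1;
            dst ~= window[0 .. take];   // steps 3+4: resize dst and copy once
            pos += take;
            gotAny = true;
            if (hit !is null) return true;
        }
    }
}

void main()
{
    auto f = fopen("input.txt", "r");   // hypothetical input file
    if (f is null) return;
    scope(exit) fclose(f);

    auto r = ChunkReader(f, new ubyte[64 * 1024]);
    ubyte[] line;
    size_t count;
    while (r.appendDelim(line, '\n'))
    {
        ++count;
        line.length = 0;   // reuse; assumeSafeAppend would avoid reallocation
    }
    printf("%zu lines\n", count);
}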

The implementation in your message on 10/13/2010 21:20 CDT:

1. Low-level read into internal buffers

2. Copy from internal buffers into the internal buffer provided by
your ByLine implementation

3. Copy from the internal buffer of ByLine into the user-supplied buffer

That's two copies. Agreed?


Andrei

I'm not sure which message you are talking about (the first or the second one).
The second one
(http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=119248)
makes a chunked read into an internal buffer (if it isn't filled yet), then
searches for the delimiter, and then copies to a user-provided buffer.
That's one copy in most cases. And that's what GNU libc does, too.

Your function calls fopen() and does not disable buffering by means of setvbuf(). By default fopen() opens in buffered mode. Does the existence of that buffer entail an extra copy?

Andrei


In my original version there was a setbuf(f, null) call. I removed it because it had zero impact on performance. I also tried using the unistd open/read functions; that had zero impact, too (btw, opening the file with O_DIRECT returned a valid file descriptor, but read operations were failing with an "invalid argument" error).
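
For reference, setbuf(f, null) is equivalent to setvbuf(f, null, _IONBF, 0),
and one plausible explanation (an assumption here, not confirmed from the
thread) for the "invalid argument" failures is that O_DIRECT requires the
user buffer, file offset, and transfer size to be aligned to the device's
logical block size. A small sketch of both, with the O_DIRECT value and the
4096-byte alignment assumed for x86/x86-64 Linux:

import core.stdc.stdio : FILE, fopen, fclose, setvbuf, _IONBF;
import core.stdc.stdlib : free;
import core.sys.posix.fcntl : open, O_RDONLY;
import core.sys.posix.unistd : read, close;

// Linux-specific bits, declared here so the sketch doesn't depend on a
// particular druntime module exposing them.
enum O_DIRECT = 0x4000;                        // x86/x86-64 Linux value
extern(C) int posix_memalign(void** memptr, size_t alignment, size_t size);

void main()
{
    // stdio: setbuf(f, null) and setvbuf(f, null, _IONBF, 0) both disable
    // the FILE*'s internal buffer, so fread calls go straight to read().
    auto f = fopen("input.txt", "r");          // hypothetical input file
    if (f !is null)
    {
        setvbuf(f, null, _IONBF, 0);
        fclose(f);
    }

    // O_DIRECT: reads tend to fail with EINVAL unless the buffer address,
    // offset, and length are aligned (often to 512 bytes or the page size).
    int fd = open("input.txt", O_RDONLY | O_DIRECT);
    if (fd >= 0)
    {
        void* p;
        if (posix_memalign(&p, 4096, 64 * 1024) == 0)
        {
            auto n = read(fd, p, 64 * 1024);   // n < 0 means the read failed
            free(p);
        }
        close(fd);
    }
}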
