On Thu, Jun 30, 2022 at 6:19 PM <i...@apache.org> wrote: > > Author: ivan > Date: Thu Jun 30 16:18:51 2022 > New Revision: 1902375 > > URL: http://svn.apache.org/viewvc?rev=1902375&view=rev > Log: > On 1.8.x branch: Merge r1785072, r1788930 from trunk: > *) apr_file_gets: Optimize for buffered files on Windows. > [Evgeny Kotkov <evgeny.kotkov visualsvn.com>] [] > > --- apr/apr/branches/1.8.x/file_io/win32/readwrite.c (original) > +++ apr/apr/branches/1.8.x/file_io/win32/readwrite.c Thu Jun 30 16:18:51 2022 > @@ -141,6 +141,56 @@ static apr_status_t read_with_timeout(ap > return rv; > } > > +static apr_status_t read_buffered(apr_file_t *thefile, void *buf, apr_size_t > *len) > +{ > + apr_status_t rv; > + char *pos = (char *)buf; > + apr_size_t blocksize; > + apr_size_t size = *len; > + > + if (thefile->direction == 1) { > + rv = apr_file_flush(thefile); > + if (rv != APR_SUCCESS) { > + return rv; > + } > + thefile->bufpos = 0; > + thefile->direction = 0; > + thefile->dataRead = 0; > + } > + > + rv = 0; > + while (rv == 0 && size > 0) { > + if (thefile->bufpos >= thefile->dataRead) { > + apr_size_t read; > + rv = read_with_timeout(thefile, thefile->buffer, > + thefile->bufsize, &read); > + if (read == 0) { > + if (rv == APR_EOF) > + thefile->eof_hit = TRUE; > + break; > + } > + else { > + thefile->dataRead = read; > + thefile->filePtr += thefile->dataRead; > + thefile->bufpos = 0; > + } > + } > + > + blocksize = size > thefile->dataRead - thefile->bufpos ? > thefile->dataRead - thefile->bufpos : size; > + memcpy(pos, thefile->buffer + thefile->bufpos, blocksize); > + thefile->bufpos += blocksize; > + pos += blocksize; > + size -= blocksize; > + }
Why would buffered reads always be full reads (i.e. until the given size of EOF)? For apr_file_read() I don't think this is expected, users might want read() to return what is there when it's there (i.e. they handle short reads by themselves), so implicit full reads will force them to use small buffers artificially and they'll lose in throughput. On the other hand, users that want full reads can do that already/currently in their own code (by calling read() multiple times). For apr_file_gets() this is the same issue, why when some read returns the \n would we need to continue reading to fill the buffer? In a pipe scenario the peer would apr_file_write("Hello\n") and the consumer's apr_file_read() would not return that to the user immediately? This has nothing to do with buffered vs non-buffered IMHO, there are apr_file_{read,write}_full() already for that. I don't know if nonblocking reads take this path too, but if so this also causes a second/useless read (because nonblocking are often short reads) that is almost sure to return EAGAIN (or windows equivalent), and the call would return EGAIN with *len > 0? > + > + *len = pos - (char *)buf; > + if (*len) { > + rv = APR_SUCCESS; > + } > + > + return rv; > +}