Re: cat(1): always use a 64K buffer

2021-12-13 Thread Theo de Raadt
I think Todd is right, and this doesn't need to change.

Todd C. Miller  wrote:

> On Sun, 12 Dec 2021 19:15:51 -0600, Scott Cheloha wrote:
> 
> > cat(1) sizes its I/O buffer according to the st_blksize of the first
> > file it processes.  We don't do this very often in the tree.  I'm not
> > sure if we should trust st_blksize.
> 
> It sizes the buffer based on st_blksize of stdout, not the input
> file.  Since st_blksize is the "optimal" blocksize for that device.
> Since cat only has a single output, the output buffer only needs
> to be sized once.
> 
> > It would be simpler to just choose a value that works in practice and
> > always use it.
> 
> I prefer it the way it is now.  By using st_blksize you get the
> proper block size regardless of whether output is to a pipe, a file
> or a tty.  I didn't suggest this for tee since it supports multiple
> output files.
> 
>  - todd
> 



Re: cat(1): always use a 64K buffer

2021-12-13 Thread Todd C . Miller
On Sun, 12 Dec 2021 19:15:51 -0600, Scott Cheloha wrote:

> cat(1) sizes its I/O buffer according to the st_blksize of the first
> file it processes.  We don't do this very often in the tree.  I'm not
> sure if we should trust st_blksize.

It sizes the buffer based on st_blksize of stdout, not the input
file.  Since st_blksize is the "optimal" blocksize for that device.
Since cat only has a single output, the output buffer only needs
to be sized once.

> It would be simpler to just choose a value that works in practice and
> always use it.

I prefer it the way it is now.  By using st_blksize you get the
proper block size regardless of whether output is to a pipe, a file
or a tty.  I didn't suggest this for tee since it supports multiple
output files.

 - todd



Re: cat(1): always use a 64K buffer

2021-12-13 Thread Otto Moerbeek
On Mon, Dec 13, 2021 at 02:52:50AM -0600, Scott Cheloha wrote:

> > On Dec 13, 2021, at 01:13, Otto Moerbeek  wrote:
> > 
> > On Sun, Dec 12, 2021 at 07:15:51PM -0600, Scott Cheloha wrote:
> > 
> >> cat(1) sizes its I/O buffer according to the st_blksize of the first
> >> file it processes.  We don't do this very often in the tree.  I'm not
> >> sure if we should trust st_blksize.
> >> 
> >> It would be simpler to just choose a value that works in practice and
> >> always use it.
> >> 
> >> 64K works well.  We settled on that for tee(1) recently.
> > 
> > Why are you also change the allocation to be per-file? You might as
> > well go for a static buffer if you make it fixed size.
> 
> Ottomalloc does canary checks when free(3)
> is called if malloc_options is set up for it.
> I always thought that was useful when 
> auditing my code.

In this case (large allocation an exact multiple of the page size) any
overflow will be caught by the MMU, independent of malloc options.  So
no big gain here to be gained from calling free. I'd say keep the
original one time malloc call.

-Otto



Re: cat(1): always use a 64K buffer

2021-12-13 Thread Scott Cheloha
> On Dec 13, 2021, at 01:13, Otto Moerbeek  wrote:
> 
> On Sun, Dec 12, 2021 at 07:15:51PM -0600, Scott Cheloha wrote:
> 
>> cat(1) sizes its I/O buffer according to the st_blksize of the first
>> file it processes.  We don't do this very often in the tree.  I'm not
>> sure if we should trust st_blksize.
>> 
>> It would be simpler to just choose a value that works in practice and
>> always use it.
>> 
>> 64K works well.  We settled on that for tee(1) recently.
> 
> Why are you also change the allocation to be per-file? You might as
> well go for a static buffer if you make it fixed size.

Ottomalloc does canary checks when free(3)
is called if malloc_options is set up for it.
I always thought that was useful when 
auditing my code.


Re: cat(1): always use a 64K buffer

2021-12-12 Thread Otto Moerbeek
On Sun, Dec 12, 2021 at 07:15:51PM -0600, Scott Cheloha wrote:

> cat(1) sizes its I/O buffer according to the st_blksize of the first
> file it processes.  We don't do this very often in the tree.  I'm not
> sure if we should trust st_blksize.
> 
> It would be simpler to just choose a value that works in practice and
> always use it.
> 
> 64K works well.  We settled on that for tee(1) recently.

Why are you also change the allocation to be per-file? You might as
well go for a static buffer if you make it fixed size.

-Otto

> 
> Thoughts?
> 
> Index: cat.c
> ===
> RCS file: /cvs/src/bin/cat/cat.c,v
> retrieving revision 1.32
> diff -u -p -r1.32 cat.c
> --- cat.c 24 Oct 2021 21:24:21 -  1.32
> +++ cat.c 13 Dec 2021 00:57:48 -
> @@ -33,9 +33,6 @@
>   * SUCH DAMAGE.
>   */
>  
> -#include 
> -#include 
> -
>  #include 
>  #include 
>  #include 
> @@ -45,7 +42,7 @@
>  #include 
>  #include 
>  
> -#define MAXIMUM(a, b)(((a) > (b)) ? (a) : (b))
> +#define BSIZE (64 * 1024)
>  
>  int bflag, eflag, nflag, sflag, tflag, vflag;
>  int rval;
> @@ -221,26 +218,20 @@ raw_args(char **argv)
>  void
>  raw_cat(int rfd, const char *filename)
>  {
> - int wfd;
> + char *buf;
>   ssize_t nr, nw, off;
> - static size_t bsize;
> - static char *buf = NULL;
> - struct stat sbuf;
> + int wfd;
>  
>   wfd = fileno(stdout);
> - if (buf == NULL) {
> - if (fstat(wfd, ) == -1)
> - err(1, "stdout");
> - bsize = MAXIMUM(sbuf.st_blksize, BUFSIZ);
> - if ((buf = malloc(bsize)) == NULL)
> - err(1, NULL);
> - }
> - while ((nr = read(rfd, buf, bsize)) != -1 && nr != 0) {
> + if ((buf = malloc(BSIZE)) == NULL)
> + err(1, NULL);
> + while ((nr = read(rfd, buf, BSIZE)) != -1 && nr != 0) {
>   for (off = 0; nr; nr -= nw, off += nw) {
>   if ((nw = write(wfd, buf + off, nr)) == -1 || nw == 0)
>   err(1, "stdout");
>   }
>   }
> + free(buf);
>   if (nr == -1) {
>   warn("%s", filename);
>   rval = 1;
> 



cat(1): always use a 64K buffer

2021-12-12 Thread Scott Cheloha
cat(1) sizes its I/O buffer according to the st_blksize of the first
file it processes.  We don't do this very often in the tree.  I'm not
sure if we should trust st_blksize.

It would be simpler to just choose a value that works in practice and
always use it.

64K works well.  We settled on that for tee(1) recently.

Thoughts?

Index: cat.c
===
RCS file: /cvs/src/bin/cat/cat.c,v
retrieving revision 1.32
diff -u -p -r1.32 cat.c
--- cat.c   24 Oct 2021 21:24:21 -  1.32
+++ cat.c   13 Dec 2021 00:57:48 -
@@ -33,9 +33,6 @@
  * SUCH DAMAGE.
  */
 
-#include 
-#include 
-
 #include 
 #include 
 #include 
@@ -45,7 +42,7 @@
 #include 
 #include 
 
-#define MAXIMUM(a, b)  (((a) > (b)) ? (a) : (b))
+#define BSIZE (64 * 1024)
 
 int bflag, eflag, nflag, sflag, tflag, vflag;
 int rval;
@@ -221,26 +218,20 @@ raw_args(char **argv)
 void
 raw_cat(int rfd, const char *filename)
 {
-   int wfd;
+   char *buf;
ssize_t nr, nw, off;
-   static size_t bsize;
-   static char *buf = NULL;
-   struct stat sbuf;
+   int wfd;
 
wfd = fileno(stdout);
-   if (buf == NULL) {
-   if (fstat(wfd, ) == -1)
-   err(1, "stdout");
-   bsize = MAXIMUM(sbuf.st_blksize, BUFSIZ);
-   if ((buf = malloc(bsize)) == NULL)
-   err(1, NULL);
-   }
-   while ((nr = read(rfd, buf, bsize)) != -1 && nr != 0) {
+   if ((buf = malloc(BSIZE)) == NULL)
+   err(1, NULL);
+   while ((nr = read(rfd, buf, BSIZE)) != -1 && nr != 0) {
for (off = 0; nr; nr -= nw, off += nw) {
if ((nw = write(wfd, buf + off, nr)) == -1 || nw == 0)
err(1, "stdout");
}
}
+   free(buf);
if (nr == -1) {
warn("%s", filename);
rval = 1;