[EMAIL PROTECTED] writes:

> According to the manual, system() splits output lines into
> 8096-char chunks; under UNIX, actually seems to return 8094
> chars, and drop the 8095th.  Spot missing digits in:
> 
>   x2 <- 
>     system("perl -e 'print \"0123456789\"x10000'",
>     intern=TRUE)
> 
> Looks like a bug in the code to remove newlines at
> src/unix/sys-unix.c:218 -- fgets() reads size-1 characters
> and adds null, so strlen(buf)<size always true.  Testing for
> '\n' explicitly is probably better (deals with 8094 chr + \n
> case) -- it turns out the win32 code already does this
> anyway.  (IIRC the read>0 condition in the win32 code would
> be redundant but I copied it anyway to be safe.)
> 
> Anyway, rather trivial diff below.  Both manpages should
> probably say 8095 rather than 8096, I think.

Confirmed for R-devel too. Thanks for the fix, will apply in due
course. Notice that the same fix handles the case where the final line
is not \n-terminated:

> nchar(x2)
 [1] 8094 8094 8094 8094 8094 8094 8094 8094 8094 8094 8094 8094 2859
> sum(nchar(x2))
[1] 99987
> length(nchar(x2))
[1] 13

I.e., we're losing one character in every block, including the short
final one: 13 blocks, so 13 of the 100000 characters are missing.
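
A minimal sketch of the fgets() behaviour described in the report, for
anyone who wants to reproduce it outside R. This is not the code in
src/unix/sys-unix.c: the helper names strip_buggy, strip_fixed and
count_kept are hypothetical, the buffer is deliberately tiny, and the
input comes from a temporary file rather than a pipe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Buggy variant (assumed shape of the test in the report): fgets()
   stores at most size-1 characters plus the NUL, so strlen(buf) < size
   is always true and the last character is dropped even when it is not
   a newline. */
static void strip_buggy(char *buf, size_t size)
{
    size_t len = strlen(buf);
    if (len > 0 && len < size)
        buf[len - 1] = '\0';
}

/* Fixed variant: drop the last character only if it really is '\n'. */
static void strip_fixed(char *buf)
{
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';
}

/* Write `input` to a temporary file, read it back with fgets() using a
   buffer of `size` bytes, strip each chunk, and return the total number
   of characters kept across all chunks. */
static size_t count_kept(const char *input, int size, int use_fix)
{
    char buf[64];
    size_t total = 0;
    FILE *fp = tmpfile();

    assert(fp != NULL && size >= 2 && size <= (int) sizeof buf);
    fputs(input, fp);
    rewind(fp);

    while (fgets(buf, size, fp) != NULL) {
        if (use_fix)
            strip_fixed(buf);
        else
            strip_buggy(buf, (size_t) size);
        total += strlen(buf);
    }
    fclose(fp);
    return total;
}
```

With a 30-character input that has no trailing newline and size 8,
fgets() returns five chunks (7, 7, 7, 7 and 2 characters). The buggy
variant keeps 25 characters, one lost per chunk including the short last
one; the fixed variant keeps all 30.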

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
