According to the manual, system() splits output lines into 8096-char chunks; under UNIX it actually seems to return 8094 chars and drop the 8095th. Spot the missing digits in:

x2 <- system("perl -e 'print \"0123456789\"x10000'", intern=T)

Looks like a bug in the code to remove newlines at src/unix/sys-unix.c:218 -- fgets() reads size-1 characters and adds a null, so strlen(buf) < size is always true. Testing for '\n' explicitly is probably better (it deals with the 8094 chars + \n case) -- it turns out the win32 code already does this anyway. (IIRC the read>0 condition in the win32 code would be redundant, but I copied it anyway to be safe.)

Anyway, rather trivial diff below. Both manpages should probably say 8095 rather than 8096, I think.

Mark <><

Index: library/base/man/unix/system.Rd
===================================================================
RCS file: /cvs/R/src/library/base/man/unix/system.Rd,v
retrieving revision 1.2
diff -u -r1.2 system.Rd
--- library/base/man/unix/system.Rd 2002/12/08 09:50:47 1.2
+++ library/base/man/unix/system.Rd 2004/02/28 15:20:09
@@ -26,7 +26,7 @@
   If \code{intern} is \code{TRUE} then \code{popen} is used to invoke the
   command and the output collected, line by line, into an \R
   \code{\link{character}} vector which is returned as the value of
-  \code{system}. Output lines of more that 8096 characters will be split.
+  \code{system}. Output lines of more that 8095 characters will be split.
 
   If \code{intern} is \code{FALSE} then the C function \code{system}
   is used to invoke the command and the value returned by \code{system}
Index: library/base/man/windows/system.Rd
===================================================================
RCS file: /cvs/R/src/library/base/man/windows/system.Rd,v
retrieving revision 1.15
diff -u -r1.15 system.Rd
--- library/base/man/windows/system.Rd 2003/05/08 21:45:54 1.15
+++ library/base/man/windows/system.Rd 2004/02/28 15:20:09
@@ -33,7 +33,7 @@
   If \code{intern = TRUE}, a character vector giving the output of the
   command, one line per character string. If the command could not be
   run or gives an error a \R error is generated.
-  (Output ines of more that 8096 characters will be split.)
+  (Output lines of more that 8095 characters will be split.)
 
   If \code{intern = FALSE}, the return value is a error code, given the
   invisible attribute (so needs to be printed explicitly). If the
Index: unix/sys-unix.c
===================================================================
RCS file: /cvs/R/src/unix/sys-unix.c,v
retrieving revision 1.39
diff -u -r1.39 sys-unix.c
--- unix/sys-unix.c 2003/09/10 11:45:29 1.39
+++ unix/sys-unix.c 2004/02/28 15:20:15
@@ -215,7 +215,8 @@
     fp = R_popen(CHAR(STRING_ELT(CAR(args), 0)), x);
     for (i = 0; fgets(buf, INTERN_BUFSIZE, fp); i++) {
         read = strlen(buf);
-        if (read < INTERN_BUFSIZE) buf[read - 1] = '\0'; /* chop final CR */
+        if (read>0 && buf[read-1] == '\n')
+            buf[read - 1] = '\0'; /* chop final CR */
         tchar = mkChar(buf);
         UNPROTECT(1);
         PROTECT(tlist = CONS(tchar, tlist));
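
For anyone who wants to see the effect in isolation, here is a minimal sketch (not the R source) of the two chop conditions applied to a buffer that already holds what fgets() would have put there. The names chop_old, chop_new and DEMO_BUFSIZE are made up for this illustration, and the buffer is shrunk to 8 bytes in place of INTERN_BUFSIZE so the strings stay readable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for INTERN_BUFSIZE, kept tiny for readability. */
#define DEMO_BUFSIZE 8

/* Old condition: fgets() stores at most size-1 characters plus the null,
   so strlen(buf) < size holds for every chunk and the last character is
   always chopped, newline or not.
   (The n > 0 guard is an addition here, only to keep the sketch safe.) */
void chop_old(char *buf, size_t size)
{
    size_t n = strlen(buf);
    if (n > 0 && n < size)
        buf[n - 1] = '\0';
}

/* Patched condition: chop only when the last character really is '\n'. */
void chop_new(char *buf)
{
    size_t n = strlen(buf);
    if (n > 0 && buf[n - 1] == '\n')
        buf[n - 1] = '\0';
}
```

With a full chunk such as "0123456" (7 characters, no newline) in a DEMO_BUFSIZE buffer, chop_old() leaves "012345", which is the dropped digit from the report, while chop_new() leaves the chunk untouched. Both turn "abc\n" into "abc".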