I'd like to report a bug (possibly a buffer overflow) in the function sub(..., perl = TRUE).
I wanted to implement the familiar Perl function for removing whitespace before and after a character string:

    sub trimwhitespace($) {
        my $string = shift;
        $string =~ s/^\s+//;
        $string =~ s/\s+$//;
        return $string;
    }

So in R this would (presumably) become:

    trimwhitespace <- function(x) {
        x <- sub('^\\s+', '', x, perl = TRUE) ## Removes preceding white spaces
        x <- sub('\\s+$', '', x, perl = TRUE) ## Removes trailing white spaces
        x
    }

Expected behavior:

    > trimwhitespace(" abc")
    [1] "abc"

On Windows:

    > trimwhitespace(" abc")
    [1] "abc\0\220\277\036\001\220°ß\08iW\001p±ß\0X°ß\0" ## That's not good! Looks like a buffer overflow

On Linux:

    [1] "abc\0\0\002\0\0 \377\0\0\0\002\0\0\0\006\0\0/\377\0\0" ## Linux goofs as well!

Debugging shows that it is the first line in the function that produces the overflow. The overflow seems proportional to the amount of preceding whitespace. I'm not sure whether this is exploitable, but it might be possible to run arbitrary code stored in a character object using this.
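A possible workaround, sketched under the assumption that the problem is confined to the perl = TRUE code path (this has not been verified): the same trimming can be written with POSIX character classes, which the default regular expression engine accepts. The name trimwhitespace_posix is hypothetical and used only for illustration. The loop afterwards checks the symptom described above, namely a result whose length grows with the amount of leading whitespace.

```r
# Hypothetical helper (not part of the original report): trims whitespace
# using POSIX character classes, so perl = TRUE is not needed.
trimwhitespace_posix <- function(x) {
  x <- sub("^[[:space:]]+", "", x)  # remove leading whitespace
  x <- sub("[[:space:]]+$", "", x)  # remove trailing whitespace
  x
}

# Try several amounts of leading whitespace and print the length of the
# trimmed result for each.
for (n in c(1, 10, 1000)) {
  padded <- paste(paste(rep(" ", n), collapse = ""), "abc", sep = "")
  result <- trimwhitespace_posix(padded)
  cat(n, "leading spaces ->", nchar(result), "characters\n")
}
```

A correct result is "abc" in every case, so anything longer than 3 characters would suggest the same problem also affects this code path.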
Hopefully this is helpful,
Robert

> version # Linux
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    2
minor    0.1
year     2004
month    11
day      15
language R

> version # Windows
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R

## PS Here are the results of another call (x is 1000 spaces)

> trimwhitespace(paste(x, "abc"))
[1] "abc\0\0\002\0\002\0\001\0\001\004\0\0\0\a\003\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\005class\0\0\006\020\0\0\0\001\0\0\004 \0\0\0\024groupGenericFunction\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\apackage\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\amethods\0\0\0þ\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\fgroupMembers\0\0\004\023\0\0\0\a\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\001+\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\001-\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\001*\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\001^\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\002%%\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\003%/%\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\001/\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\ageneric\0\0\006\020\0\0\0\001\0\0\004 \0\0\0\005Arith\0\0\004\002\0\0\002\377\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\004base\0\0\0þ\0\0\004\002\0\0\002\377\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\004base\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\005group\0\0\004\023\0\0\0\001\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\003Ops\0\0\004\002\0\0
\0\001\0\0\024 \0\0\0\nvalueClass\0\0\0\020\0\0\0\0\0\0\004\002\0\0\0\001\0\0\024 \0\0\0 signature\0\0\004\020\0\0\0\002\0\0\024 \0\0\0\002e1\0\0\024 \0\0\0\002e2\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\adefault\0\0\003\023\0\0\0\0\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\amethods\0\0\0\023\0\0\0\0\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\bargument\0\0\0\001\0\0\024 \0\0\0\002e1\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\nallMethods\0\0\0\023\0\0\0\0\0\0\004\002\0\0\001\377\0\0\006\020\0\0\0\001\0\0\004 \0\0\0\vMethodsList\0\0\004\002\0\0\002\377\0\0\004\020\0\0\0\001\0\0\004 \0\0\0\amethods\0\0\0þ\0\0\0þ\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\bskeleton\0\0\0\006\0\0\0\003\0\0\004\002\0\0\v\377\0\0\0û\0\0\004\002\0\0\0\001\0\0\024 \0\0\0\002e2\0\0\0û\0\0\0þ\0\0\0\006\0\0\0\001\0\0\024 \0\0\0\004stop\0\0\0\002\0\0\004\020\0\0\0\001\0\0\004 \0\0\0>Invalid call in method dispatch to \"Arith\" (no default method)\0\0\0þ\0\0\0\002\0\0\v\377\0\0\0\002\0\0\016\377\0\0\0þ\0\0\0þ\0\0\0÷\0\0\0\0\0\0\0\001\0\0\004 \0\0\0\benv::138\0\0\004\002\0\0\v\377\0\0\0û\0\0\004\002\0\0\016\377\0\0\0û\0\0\0þ\0\0\0\006\0\0\017\377\0\0\0\002\0\0\004\020\0\0"

Robert McGehee
Geode Capital Management, LLC
53 State Street, 5th Floor | Boston, MA | 02109
Tel: 617/392-8396 Fax: 617/476-6389
______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel