On Wed, 10 Aug 2005, Douglas Bates wrote:

> The Harwell-Boeing format for exchanging matrices is one of those
> lovely legacy formats that is based on fixed-format Fortran
> specifications and 80 character records.  (Those of you who don't know
> why they would be 80 characters instead of, say, 60 or 100 can ask one
> of us old-timers some day and we'll tell you long, boring stories
> about working with punched cards.)
>
> Reading this format would take about 10 lines of R code if it were not
> for the fact that it allows things like 40 two-digit integers to be
> written as one 80 character record with no separators.  This actually
> made sense to some people once upon a time.
>
> I could use read.fwf or, better, use some of the code in the read.fwf
> function to extract the strings that should have been separated and
> convert them to numeric values but I have been trying to think if
> there is a more clever way of doing this.  I know the number of
> records and the number of elements to read and, if it would help, I
> can assemble the records into one long text string.
>
> Can anyone think of a vectorized way to extract successive substrings
> of length k or, perhaps, a way to use regular expressions to insert a
> blank after every k characters?

substring() can do that:

st <- "1234567890abcdef"
lens <- seq(0, nchar(st), 2)
substring(st, 1+lens[-length(lens)], lens[-1])
[1] "12" "34" "56" "78" "90" "ab" "cd" "ef"

as it is vectorized internally.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
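
The substring() idiom in the reply generalizes to any field width k, and the regular-expression route raised in the question can be sketched alongside it. What follows is only an illustration under stated assumptions: split_fixed() is a hypothetical helper name, not something from the thread or from base R, and the two sample records are made up (shortened to 8 characters instead of 80).

```r
# Hypothetical helper (not from the thread): split one string into
# consecutive fields of width k, using vectorized substring().
split_fixed <- function(x, k) {
  stopifnot(is.character(x), length(x) == 1L, k >= 1)
  n <- nchar(x)
  if (n == 0L) return(character(0))
  starts <- seq(1L, n, by = k)
  # The last field may be shorter than k if n is not a multiple of k.
  substring(x, starts, pmin(starts + k - 1L, n))
}

# Made-up sample records: two-digit integers written with no separators.
records <- c(" 1 2 310", "111213 4")
long <- paste(records, collapse = "")

# Approach 1: cut fixed-width fields, then convert.
values <- as.integer(split_fixed(long, 2))

# Approach 2: insert a blank after every 2 characters, then let scan()
# split on whitespace.
spaced <- gsub("(.{2})", "\\1 ", long)
values2 <- scan(text = spaced, what = integer(), quiet = TRUE)
```

One difference worth keeping in mind: a field that is entirely blank becomes NA under the substring() approach, whereas scan() simply skips it, so the two results can differ in length on such input.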