On Mon, 2007-08-06 at 21:23 +0100, Prof Brian Ripley wrote:
> I am sure Marc knows that ?sub has examples of trimming trailing space and
> whitespace in various styles.

Indeed, though leading spaces are not covered there, so thought that I
would take a minute or two to provide both and the combination of the
two using gsub().

> On Mon, 6 Aug 2007, Marc Schwartz wrote:
>
> > On Mon, 2007-08-06 at 12:15 -0700, adiamond wrote:
> >> I feel like an idiot posting this because every language I've ever seen
> >> has a string function that trims blanks off strings (off the front or
> >> back or both).
>
> Some very common languages do not, though. It is an exercise in Kernighan
> & Ritchie (the original C reference), and an FAQ entry for Perl.
>
> >> Ideally, it would process whole data frames/matrices etc but I don't
> >> even see one that processes a single string. But I've searched and I don't
> >> even see that. There's a strtrim function but it does something completely
> >> different.
> >
> > If you want to do this while initially importing the data into R using
> > one of the read.table() family of functions, see the 'strip.white'
> > argument in ?read.table, which would do an entire data frame in one
> > call.
> >
> > Otherwise, the easiest way to do it would be to use sub() or gsub()
> > along the lines of the following:
> >
> > # Strip leading space
> > sub("^ +", "", YourTextVector)
> >
> > # Strip trailing space
> > sub(" +$", "", YourTextVector)
> >
> > # Strip both
> > gsub("(^ +)|( +$)", "", YourTextVector)
> >
> > Examples of use:
> >
> >> sub("^ +", "", " Leading Space")
> > [1] "Leading Space"
> >
> >> sub(" +$", "", "Trailing Space ")
> > [1] "Trailing Space"
> >
> >> gsub("(^ +)|( +$)", "", " Leading and Trailing Space ")
> > [1] "Leading and Trailing Space"
> >
> > See ?sub which also has ?gsub
> >
> > Note that the above will only strip spaces, not all white space.
> >
> > You can then use the appropriate call in one of the *apply() family of
> > functions to loop over columns/rows as may be appropriate.
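
As noted, the patterns above match the space character only. Here is a sketch of the same calls using the POSIX class [[:space:]], which also matches tabs and newlines. The helper names trim_leading, trim_trailing and trim_both are hypothetical, not base R functions.

```r
# Hypothetical helpers (not part of base R). [[:space:]] matches spaces,
# tabs, newlines and other white space characters.
trim_leading  <- function(x) sub("^[[:space:]]+", "", x)
trim_trailing <- function(x) sub("[[:space:]]+$", "", x)
trim_both     <- function(x) gsub("^[[:space:]]+|[[:space:]]+$", "", x)

# All three are vectorised over a character vector.
x <- c("  Leading", "Trailing\t", "\t Both \n")
trim_both(x)   # "Leading" "Trailing" "Both"
```

Later versions of R (3.2.0 onward) provide trimws() in base, which covers the same ground.
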
>
> Well, arrays are vectors and so can be done by
>
> A[] <- sub(....., A)
>
> and data frames with character columns by
>
> A[] <- lapply(A, function(x) sub(....., x))

Right. One could probably use it on mixed data frames along the lines
of the following (untested):

A[] <- lapply(A, function(x)
              if (is.character(x) || is.factor(x)) sub(....., x) else x)

And leave out the "|| is.factor(x)" if one only wanted character
columns affected.

Thanks,

Marc

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
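
A fuller sketch of the mixed data frame case, under a few assumptions. It branches with if/else per column because ifelse() returns a result the same length as its test, and is.character(x) is a single TRUE or FALSE for a whole column. It also edits the factor levels rather than calling sub() on the factor itself, since sub() returns a character vector. The names trim_spaces and trim_df are made up for illustration.

```r
# Hypothetical helper: strip leading and trailing spaces (spaces only).
trim_spaces <- function(x) gsub("(^ +)|( +$)", "", x)

# Hypothetical helper: trim character and factor columns of a data frame,
# leaving every other column untouched.
trim_df <- function(df) {
  df[] <- lapply(df, function(x) {
    if (is.character(x)) {
      trim_spaces(x)
    } else if (is.factor(x)) {
      levels(x) <- trim_spaces(levels(x))  # keeps the column a factor
      x
    } else {
      x
    }
  })
  df
}

A <- data.frame(id   = 1:2,
                name = c("  Ann ", "Bob  "),
                grp  = factor(c(" a", "b ")),
                stringsAsFactors = FALSE)
B <- trim_df(A)
B$name         # "Ann" "Bob"
levels(B$grp)  # "a" "b"
```
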