Hi,

I need to resample characters from a dataset that consists of an extremely
long string that is written over hundreds of thousands of lines, each of
length 50 characters.  I am currently doing this by first inserting a space
after each character in the dataset and then using the following commands:

y <- as.matrix(read.table("data.txt", stringsAsFactors=FALSE))
bstrap <- sample(length(y), 100000, TRUE)
write(y[bstrap], file="Rep1.txt", ncolumns=50, append=FALSE)
bstrap <- sample(length(y), 100000, TRUE)
write(y[bstrap], file="Rep2.txt", ncolumns=50, append=FALSE)
bstrap <- sample(length(y), 100000, TRUE)
.
.
.
and so on for 500 reps.


I think there should be a better way of doing this.  My specific questions:

1. Is there a way to avoid inserting spaces between the characters before
calling the "sample" command (because I don't want spaces between the
resampled characters in the output either; see number 2 below)?

2. If I have no choice but to insert the spaces in my data before
resampling, is there a way to output the resampled data without spaces
(simply as 50-character strings, one below the other)?  I tried adding
strip.white=TRUE to the write() call, but it gave me an error because
write() did not recognize that argument.

3. Finally, since I have to get 500 such resampled reps from each dataset
(and there are over 20 such huge datasets) is there a way around having to
write a separate write command for each rep?
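
A minimal sketch of one possible approach to the three points above, offered
under assumptions rather than as a definitive answer: the file is read with
readLines() so no spaces are needed, each line is split into single characters
with strsplit(), the resampled characters are pasted back into 50-character
strings, and a for loop replaces the 500 separate write commands. The helper
name resample_chars, the tiny ACGT input, and the temporary file names are all
made up for illustration.

```r
# Hypothetical helper (not from the original post): resample characters with
# replacement from a file of fixed-width lines and write them back out as
# lines of `width` characters with no separating spaces.
resample_chars <- function(infile, outfile, n = 100000, width = 50) {
  # Read each line as one string; no spaces between characters are required.
  lines <- readLines(infile)
  # Split every line into single characters and flatten into one vector.
  chars <- unlist(strsplit(lines, ""))
  # Draw n characters with replacement.
  draw <- sample(chars, n, replace = TRUE)
  # Assign consecutive blocks of `width` characters to the same group,
  # then paste each group into a single string.
  block <- ceiling(seq_along(draw) / width)
  out <- vapply(split(draw, block), paste, character(1), collapse = "")
  writeLines(out, outfile)
  invisible(out)
}

# Small made-up input so the example is self-contained.
infile <- tempfile(fileext = ".txt")
writeLines(c("ACGTACGTAC", "GGTTAACCGT"), infile)

set.seed(1)  # only to make the example repeatable
for (i in 1:3) {  # 1:500 for the real run
  outfile <- file.path(tempdir(), sprintf("Rep%d.txt", i))
  resample_chars(infile, outfile, n = 100, width = 50)
}
```

For the real data one would presumably call resample_chars("data.txt",
sprintf("Rep%d.txt", i)) inside the loop, and wrap that in a second loop over
the 20 dataset file names. Splitting a very large file into single characters
holds one vector element per character in memory, so this assumes the dataset
fits in RAM.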

Any suggestions will be greatly appreciated.

Thanks,

S.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.