If you think you might want to put this function into a package, it would be much better to use gsub() within R than to hand the job off to an external program such as sed, because non-POSIX operating systems (Windows) are a headache to support that way.
-- 
Sent from my phone. Please excuse my brevity.
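For what it is worth, here is a rough sketch of how the whole job could stay inside R. The function name read_padded_records, its arguments, and the default width of 399 (taken from the seq() calls in the quoted code) are my own assumptions, not anything from the original function, and it assumes the file is plain ASCII. It reads the file as raw bytes so the NULs can be swapped for spaces without changing record lengths, cuts each line into fixed-width pieces, and then strips trailing spaces with gsub.

```r
# Hypothetical helper: the name, arguments and default width are illustrative
# assumptions, not part of the original function.
read_padded_records <- function(path, width = 399L) {
  # Read the whole file as raw bytes so embedded NULs are neither dropped
  # nor allowed to truncate the text
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)

  # Replace NUL (0x00) with space (0x20); record lengths are unchanged
  bytes[bytes == as.raw(0x00)] <- as.raw(0x20)

  # Convert to text and split into lines (handles LF and CRLF endings)
  lines <- strsplit(rawToChar(bytes), "\r?\n")[[1]]

  # Cut every line into consecutive pieces of at most `width` characters
  pieces <- unlist(lapply(lines, function(x) {
    n <- nchar(x)
    if (n == 0L) return(character(0))
    starts <- seq(1L, n, by = width)
    substring(x, starts, pmin(starts + width - 1L, n))
  }))

  # Strip trailing spaces with gsub instead of calling sed
  gsub(" +$", "", pieces)
}

# Example call, assuming `filepath` is defined as in the quoted function:
# quartile <- read_padded_records(paste(filepath, "arm.txt", sep = ""))
```

As an aside on the second sed call: an expression of the form /\x20/d deletes every line that contains a space rather than trimming the spaces, which may be part of why the output is not what you expect. The gsub pattern above only removes spaces at the end of each record.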
On September 10, 2016 12:23:37 PM PDT, Glenn Schultz <glennmschu...@me.com> wrote:
>I have a file that for basically carries three datasets of differing
>lengths.  To make this a single downloadable file the creator of the
>file as used both NUL hex00 and space hex20 to normalize the lengths.
>
>Below is the function that I am writing.  I am using sed to replace the
>hex characters.  First, to get past NUL I use sed to replace hex 00
>with hex 20.  This has worked.  Once the Nul is removed and can
>successfully parse the file with ReadLine sub_str.  This final step
>before delimiting the file and making it nice and tidy is to remove the
>hex 20 characters.  I am using the same strategy to eliminate the
>spaces and sed command works in a shell but does not work in the R
>function.  What am I doing wrong?  I have dput - some of the nastier
>lines with hex 20 characters below my code.
>
>Any advice is appreciated.
>
>Glenn
>
>arm <- function(filepath){
>callpath <- paste(filepath, "arm.txt", sep ="")
>ARMReturn <- paste(filepath, "arm.csv", sep = "")
>ARMPoolReturnPath <- paste(filepath,"armatpool.csv", sep = "")
>ARMNextChgReturnPath <- paste(filepath,"nexratechangedate.csv", sep =
>"")
>ARMFirstPmtReturnPath <- paste(filepath,"firstpaymentdate.csv", sep =
>"")
>
># This file contains NUL hex characters before parsing the file replace
># the hex NUL x00 with space x20 and save as a csv file.  Use system
>command
>sedcommand <- paste("sed -e 's/\\x00/\\x20/g' <",
>filepath, "arm.txt",
>">", "arm.csv", sep = " ")
>system(sedcommand)
>
># read the arm quartile data to a file once skipNuls then length of
>each
># record set changes and the data map provided by FNMA is no longer
>valid
># with respect to the length of each embedded data set
>data <- readLines(ARMReturn, encoding = "ascii")
>
>quartile <- NULL
>numchar <- nchar(x = data, type = "chars")
>start <- c(seq(1, numchar, 399))
>end <- c(seq(399, numchar, 399))
>quartile <- str_sub(data, start[1:length(start)], end[1:length(end)])
>write(quartile, ARMReturn)
>
># The file has been parsed accroding to length 400 for each data
>element.
># The next step is to remove all the trailing white space hex character
># x20
>
>sedcommand2 <- paste("sed -e '/\\x20/d' <",
>filepath, "arm.csv",
>">", "arm2.csv", sep = "")
>system(sedcommand2)
>} # end of function
>
>
>c(" 555556
>WS320021201006125{000378{000348{
> ",
>" 555556
>WS320021201006250{000954{000880{
> ",
>" 555556
>WS320021201005625{001062{000983{
> ",
>" 555556
>WS320030101005250{000027{000025{
> ",
>" 555556
>WS320030101006500{000033{000030{
> ",
>" 555556
>WS320030101005125{000061{000056{
> ",
>" 555556
>WS320030101005375{000095{000088{
> ",
>" 555556
>WS320030101005350{000217{000200{
> ",
>" 555556
>WS320030101006125{000400{000369{
> ",
>" 555556
>WS320030101005310{000439{000406{
> ",
>" 555556
>WS320030101006000{000573{000529{
> "
>
>
>
>
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.