Never mind. I found a trial-and-error solution for testing for numeric(0).
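For anyone who hits the same problem: grep() returns a zero-length integer vector when nothing matches, so the test has to look at the length of the result rather than its value. Here is a minimal sketch of that check. The sample strings are made up for illustration, and the grepl() line is shown only as a possible alternative, not something the script relies on.

```r
# TRUE only for a numeric (or integer) vector of length zero,
# which is what grep() returns when there is no match.
is.numeric0 <- function(x) {
  is.numeric(x) && length(x) == 0
}

# Hypothetical sample lines: one tip link and one heading
tip_line     <- "tips:example|Example tip"
heading_line <- "==== Heading ===="

no_match  <- is.numeric0(grep("====", tip_line))      # TRUE: nothing found
has_match <- is.numeric0(grep("====", heading_line))  # FALSE: grep() returned 1

# Alternative: grepl() gives one logical per element, so there is
# no zero-length result to test for.
is_tip <- !grepl("====", c(tip_line, heading_line))   # TRUE FALSE
```

Using && instead of & keeps the result a single TRUE or FALSE, which is what an if() condition needs.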
I'm getting closer to an "alltips.html" output I can enjoy:

http://pj.freefaculty.org/R/alltips.html

I really think we should consider showing visitors to the web site only this main document, or perhaps even just this one:

http://wiki.r-project.org/rwiki/doku.php?id=tips:tips

Only people who want to edit things should ever have to look at the Wiki part. Why? Right now there is a tangle of menus. I got lost several times clicking on links and tables of contents and couldn't find my way around. If I'm lost, you know visitors will be lost too.

My alltips version has the same "fonts too small" problem that affects the Wiki, because it simply borrows the Wiki's style sheet, and you are fixing that, right? There are some unexpected "tables of contents" that appear throughout. I can't say why yet, but I will find out. And there is some crap at the top and bottom of this document that I'm going to eliminate. I'm not sure how, but you'd better bet it will be tedious and stupid, because this code makes me feel tedious and stupid. Anyway, the internal links work.
And here's how I did this:

# head of the page
wikihead <- c(
  "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"",
  " \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">",
  "<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\"",
  " lang=\"en\" dir=\"ltr\">",
  "<head>",
  " <meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" />",
  " <title>All R wiki tips</title>",
  "",
  " <link rel=\"stylesheet\" media=\"screen\" type=\"text/css\" ",
  " href=\"http://wiki.r-project.org/rwiki/lib/exe/css.php\" />",
  " <link rel=\"stylesheet\" media=\"print\" type=\"text/css\" ",
  " href=\"http://wiki.r-project.org/rwiki/lib/exe/css.php?print=1\" />",
  "",
  "</head>",
  "<body>")

# fetch the raw wiki source of the tips index page
rawimport <- readLines("http://wiki.r-project.org/rwiki/doku.php?id=tips:tips&do=export_raw")

htmlout <- "alltips.html"
cat(wikihead, file = htmlout, sep = "\n")

# keep only the lines that link to a tip or that are section headings
whatlinkstoatip <- grep('\\* \\[\\[', rawimport)
majorHeadings <- grep('====', rawimport)
whatlinkstoatip <- sort(c(whatlinkstoatip, majorHeadings))

# strip the list bullet and the link brackets, leaving "name|title"
cleanedlines <- gsub(" \\* \\[\\[", "",
                     gsub("]]", "", rawimport[whatlinkstoatip]))

# tiplist <- strsplit(gsub(' \\* \\[\\[', '', rawimport[whatlinkstoatip]), '\\|')
# tipnames <- unlist(lapply(tiplist, function(x) {
#   gsub('\\]\\].*', '', x[2])
# }))
# tiplinks <- unlist(lapply(tiplist, '[', 1))

# TRUE when x is a zero-length numeric vector, as returned by grep()
# when nothing matches; && keeps the result a single logical
is.numeric0 <- function(x) { length(x) == 0 && is.numeric(x) }

# write a table of contents in the
# beginning of the file
cat('\n<h1>Table of Contents</h1>\n', file = htmlout, append = TRUE)

# for (i in seq(along = tipnames)) {
#   cat('<a href="#', tiplinks[i],
#       '">', tipnames[i], '</a>, \n',
#       sep = "", file = htmlout, append = TRUE)
# }
# cat('...\n', file = htmlout, append = TRUE)

# first pass: one table-of-contents entry per tip, plus the section headings
# (no pages are downloaded here; that happens in the second pass)
for (item in cleanedlines) {
  # no "====" in the line means it is a tip link, not a heading
  itsatip <- is.numeric0(grep("====", item))
  if (isTRUE(itsatip)) {
    tipitem <- unlist(strsplit(item, "\\|"))
    tipName <- gsub(" ", "", tipitem[1])   # "\ " is not a valid escape in R
    tipTitle <- tipitem[2]
    cat('<a href="#', tipName, '">', tipTitle, '</a><br />\n',
        sep = "", file = htmlout, append = TRUE)
  } else {
    reviseditem <- sub("====", "<h2>", item)
    reviseditem <- sub("====", "</h2>", reviseditem)
    cat(reviseditem, "\n", sep = "", file = htmlout, append = TRUE)
  }
}

# second pass: download each tip and append its body under a named anchor
for (item in cleanedlines) {
  itsatip <- is.numeric0(grep("====", item))
  if (isTRUE(itsatip)) {
    tipitem <- unlist(strsplit(item, "\\|"))
    tipName <- gsub(" ", "", tipitem[1])
    tipTitle <- tipitem[2]
    dlthis <- paste("http://wiki.r-project.org/rwiki/doku.php?id=",
                    tipName, "&do=export_xhtml", sep = "")
    print(dlthis)   # show progress
    xhtml <- readLines(dlthis)
    # keep only the lines strictly between <body> and </body>
    # (the original used +1 on the end, which also copied </body> and
    # whatever followed it into the output)
    content <- (grep('<body', xhtml)[1] + 1):(grep('</body', xhtml)[1] - 1)
    # the anchor needs its closing quote, and tipTitle must be pasted in
    # as a value rather than appear as literal text
    cat('<a name="', tipName, '" id="', tipName, '">', tipTitle, '</a>\n',
        sep = "", file = htmlout, append = TRUE)
    cat(xhtml[content], file = htmlout, append = TRUE, sep = "\n")
  } else {
    reviseditem <- sub("====", "<h2>", item)
    reviseditem <- sub("====", "</h2>", reviseditem)
    cat(reviseditem, "\n", sep = "", file = htmlout, append = TRUE)
  }
}

# close the document that wikihead opened
cat("", "</body>", "</html>", file = htmlout, append = TRUE, sep = "\n")

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

_______________________________________________
R-sig-wiki mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-wiki
