Re: [R] scan html: sep = ""

Uwe Ligges Mon, 04 Apr 2005 08:46:30 -0700

Christoph Lehmann wrote:

entry from html:
<tr bgcolor=#9090f0><td align="right"><b>BM</b></td><td> 0.952</td><td> 0.136</td><td> 6.984</td><td>0.000000</td></tr> <tr bgcolor=#9090f0><td align="right"><b>BH</b></td><td> 1.338</td><td> 0.136</td><td> 9.821</td><td>0.000000</td></tr>
 using
left.data<- scan(paste(path, left.file, sep = ""), what = 'character',
               sep=c("<td>", "</td>"))
yields
 > left.data
 [1] "  "                  "tr bgcolor=#9090f0>" "td align=right>"
 [4] "b>BM"                "/b>"                 "/td>"
 [7] "td> 0.952"           "/td>"                "td> 0.136"
[10] "/td>"                "td> 6.984"           "/td>"
[13] "td>0.000000"         "/td>"                "/tr>"
[16] "  "                  "tr bgcolor=#9090f0>" "td align=right>"
[19] "b>BH"                "/b>"                 "/td>"
[22] "td> 1.338"           "/td>"                "td> 0.136"
[25] "/td>"                "td> 9.821"           "/td>"
[28] "td>0.000000"         "/td>"                "/tr>"
Why doesn't it detect the whole '<tr>' as sep?
Uwe Ligges wrote:
Christoph Lehmann wrote:
Hi, I am trying to import HTML text and need to split the fields at each <td> or </td> entry.

How can I do that? sep = '<td>' doesn't yield the right result.
If the tags always come in matching pairs, use
  sep=c("<td>", "</td>")

Apologies, one should not send untested code. "sep" must be a single character rather than a string containing more than one character.

So you may want to try out my second suggestion.

Uwe Ligges
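
To illustrate the point about "sep", here is a minimal sketch. It assumes a shortened copy of the row from the question and an R version in which scan() accepts a text= argument; quote = "" is set only to keep quote handling out of the picture. With a one-character separator such as "<", scan() splits at every "<" and leaves the rest of each tag attached to the field, which resembles the output quoted above. That the original call effectively split on "<" is an inference from that output, not something stated in the thread.

```r
# Shortened, hypothetical sample taken from the row in the question.
sample_html <- "<td> 0.952</td><td> 0.136</td>"

# "sep" is a single character here, so the split happens at every "<".
fields <- scan(text = sample_html, what = "character",
               sep = "<", quote = "", quiet = TRUE)

# The tag remainders ("td>", "/td>") stay glued to the fields.
print(fields)
```

The values never appear on their own, because "<td>" as a whole is not treated as the separator.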

If not, you can read the whole lot with readLines and then use strsplit on both patterns, for example.
Uwe Ligges
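
The readLines/strsplit suggestion could look roughly like the sketch below. It is only an illustration: the row is copied from the question and fed through a textConnection so the example runs without a file (in practice readLines(paste(path, left.file, sep = "")) would supply the lines), and the regular expression and the names html_row, pieces, cells and values are assumptions, not part of the thread.

```r
# One table row from the question; a textConnection stands in for the file.
html_row <- '<tr bgcolor=#9090f0><td align="right"><b>BM</b></td><td> 0.952</td><td> 0.136</td><td> 6.984</td><td>0.000000</td></tr>'
con <- textConnection(html_row)
lines <- readLines(con)
close(con)

# Split at every opening <td ...> (attributes allowed) or closing </td>.
pieces <- strsplit(lines[1], "</?td[^>]*>")[[1]]

# Drop any remaining tags (<tr>, <b>, ...), trim blanks, discard empty pieces.
cells <- trimws(gsub("<[^>]+>", "", pieces))
cells <- cells[nzchar(cells)]

# First cell is the row label, the rest are the numeric columns.
values <- as.numeric(cells[-1])
print(cells)
print(values)
```

Splitting on one pattern that covers both tags avoids the single-character limit of "sep"; for anything beyond simple, regular tables a real HTML parser would be the safer choice.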
Thanks for the hints.
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


