[R] Scraping info from a web site?

Spencer Graves Wed, 31 Jan 2018 02:40:42 -0800

Hi, All:

What would you suggest one use to read the data on members of the US Congress and their positions on net neutrality from "https://www.battleforthenet.com/scoreboard" into R?

I found recommendations for the "rvest" package to "Easily Harvest (Scrape) Web Pages". I tried the following:



URL <- 'https://www.battleforthenet.com/scoreboard/'
library(rvest)
Bftn <- read_html(URL)
str(Bftn)


List of 2
 $ node:<externalptr>
 $ doc :<externalptr>
 - attr(*, "class")= chr [1:2] "xml_document" "xml_node"


       However, I don't know what to do with <externalptr>.
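
As a hedged aside: the <externalptr> entries are handles to the parsed document, which lives outside R's own data structures, so str() cannot show what they contain. The contents are normally read through accessor functions instead. A minimal sketch, assuming a small made-up HTML string rather than the real scoreboard page (the markup, title, and name below are invented for illustration):

```r
library(rvest)

# Hypothetical stand-in for the real page; this markup is invented.
html <- '<html><head><title>Demo scoreboard</title></head>
<body><h1 id="senate">Senate</h1><p class="psb-unknown">Jane Doe</p></body></html>'

# read_html() also accepts a literal HTML string, so no network access is needed here.
doc <- read_html(html)

# Accessors read through the external pointer and return ordinary R character vectors.
page_title <- html_text(html_node(doc, "title"))
heading    <- html_text(html_node(doc, "#senate"))

print(page_title)
print(heading)
```

The same accessors should apply to the object returned by read_html(URL), provided the page's HTML actually contains the elements being selected.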

The "Selectorgadget" vignette with rvest suggested selecting what I wanted on the web page and pasting that as an argument into "html_node". This led me to try the following:



Bftn_nodes <- html_nodes(Bftn,
    '.psb-unknown , #house, #senate, #senate p')


str(Bftn_nodes)
List of 4
 $ :List of 2
  ..$ node:<externalptr>
  ..$ doc :<externalptr>
  ..- attr(*, "class")= chr "xml_node"
 $ :List of 2
  ..$ node:<externalptr>
  ..$ doc :<externalptr>
  ..- attr(*, "class")= chr "xml_node"
 $ :List of 2
  ..$ node:<externalptr>
  ..$ doc :<externalptr>
  ..- attr(*, "class")= chr "xml_node"
 $ :List of 2
  ..$ node:<externalptr>
  ..$ doc :<externalptr>
  ..- attr(*, "class")= chr "xml_node"
 - attr(*, "class")= chr "xml_nodeset"

This seems like it may be progress, but I'm still confused about what to do next. Or maybe I should be using a different package? Or posting this question somewhere else, like StackOverflow.com?
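
One possible next step, sketched under assumptions: an "xml_nodeset" can be passed to html_text() and html_attr() to pull out text and attribute values, which can then be assembled into a data frame. The markup, class names, attribute name, and people below are hypothetical and are not taken from the real site. If a page builds its content with JavaScript after loading, read_html() would not see that content and this approach alone would not be enough.

```r
library(rvest)

# Invented markup: class names, attribute names and people are hypothetical.
html <- '<div id="senate">
  <div class="politician" data-position="for"><span class="name">Jane Doe</span></div>
  <div class="politician" data-position="unknown"><span class="name">John Roe</span></div>
</div>'

doc <- read_html(html)

# Select one node per member; the result is an xml_nodeset.
members <- html_nodes(doc, "#senate .politician")

# Turn the nodeset into ordinary vectors, then into a data frame.
scoreboard <- data.frame(
  name     = html_text(html_nodes(members, ".name"), trim = TRUE),
  position = html_attr(members, "data-position"),
  stringsAsFactors = FALSE
)

print(scoreboard)
```

With the real page, the CSS selectors would need to be replaced by whatever Selectorgadget or the browser's inspector reports for the elements of interest.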



      Thanks,
      Spencer Graves

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
