Re: [R] Download CSV Files from EUROSTAT Website
On 4 Nov 2013 19:30, David Winsemius dwinsem...@comcast.net wrote:
> Maybe you should use their download facility rather than trying to
> deparse a complex webpage with lots of special user interaction features:
>
> http://appsso.eurostat.ec.europa.eu/nui/setupDownloads.do

That web page depends on the user already having been to the previous page to set up a session, so directly downloading a dataset requires setting up cookies and making sure the request has all the right parameters. Looks like a right pain.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Download CSV Files from EUROSTAT Website
This looks as though you need to be a little XML old-school. readHTMLTable is a summary function drawing on:

?htmlTreeParse() turns the table into XML
?xpathApply() and more.

xpathApply(doc, "//td", function(x) xmlValue(x))
# breaks each line at the end of a table cell and extracts the value
# "//th" picks out the table headings, without distinction as to whether
# they are row or column headings

This is followed by various gsub() calls and by turning the result into a matrix (as it comes out as a list of values without columns).

I couldn't identify the headings, but the table body is definitely doable. readHTMLTable seems to assume that the column headings are a single row, which isn't always the case.

Paul Bivand

On 5 November 2013 18:44, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote:
> On 4 Nov 2013 19:30, David Winsemius dwinsem...@comcast.net wrote:
>> Maybe you should use their download facility rather than trying to
>> deparse a complex webpage with lots of special user interaction features:
>>
>> http://appsso.eurostat.ec.europa.eu/nui/setupDownloads.do
>
> That web page depends on the user already having been to the previous
> page to set up a session and so directly downloading a dataset requires
> setting up cookies and making sure the request has all the right
> parameters. Looks like a right pain.
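The xpathApply() route Paul describes can be sketched on a toy table. This is only an illustration under stated assumptions: the XML package is installed, the HTML string below is made up (the real Eurostat markup will differ), and the headings sit in a single row of th cells, which Paul notes is not always the case.

```r
library(XML)  # assumption: the XML package is installed

# Made-up stand-in for the scraped page; not the real Eurostat markup.
html <- paste0(
  "<table>",
  "<tr><th>geo</th><th>2011</th><th>2012</th></tr>",
  "<tr><td>IT</td><td>1.5</td><td>1.7(p)</td></tr>",
  "<tr><td>FR</td><td>2.0</td><td>2.1</td></tr>",
  "</table>"
)

doc <- htmlParse(html, asText = TRUE)

# "//th" collects every heading cell, "//td" every body cell, in document order.
headings <- xpathSApply(doc, "//th", xmlValue)
cells    <- xpathSApply(doc, "//td", xmlValue)

# The cells come back as a flat character vector, so rebuild the rows by hand.
body <- matrix(cells, ncol = length(headings), byrow = TRUE)
colnames(body) <- headings
tab <- as.data.frame(body, stringsAsFactors = FALSE)
print(tab)
```

With multi-row headings the th cells would have to be grouped per row before they could be used as column names; the sketch does not attempt that.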
Re: [R] Download CSV Files from EUROSTAT Website
Lorenzo,

You might want to post this as a new question to get some new eyes on it. Or, you could try posting your question to http://stackoverflow.com/. Scraping the web is a common topic for that group.

Jean

On Mon, Nov 4, 2013 at 3:53 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote:
> Hello,
> And thanks a lot. This is indeed very close to what I need.
> I am trying to figure out how not to lose the headers and how to avoid
> downloading labels like (p) together with the numerical data I am
> interested in.
Re: [R] Download CSV Files from EUROSTAT Website
Hello,

And thanks a lot. This is indeed very close to what I need. I am trying to figure out how not to lose the headers and how to avoid downloading labels like (p) together with the numerical data I am interested in. If anyone on the list knows how to make these minor modifications, s/he will make my life much easier.

Cheers

Lorenzo

On Fri, 01 Nov 2013 14:25:49 +0100, Adams, Jean jvad...@usgs.gov wrote:
> Lorenzo,
> I may be able to help you get started. You can use the XML package to
> grab the information off the internet.
>
> library(XML)
> mylines <- readLines(url("http://bit.ly/1coCohq"))
> closeAllConnections()
> mylist <- readHTMLTable(mylines, asText = TRUE)
> mytable <- mylist$xTable
>
> However, when I look at the resulting object, mytable, it doesn't have
> informative row or column headings. Perhaps someone else can figure out
> how to get that information.
Re: [R] Download CSV Files from EUROSTAT Website
Hi Lorenzo,

Perhaps package pxR can help you out.

http://cran.at.r-project.org/web/packages/pxR/index.html

pxR: PC-Axis with R
The pxR package provides a set of functions for reading and writing PC-Axis files, used by different statistical organizations around the globe for data dissemination.

Regards,
Carlos Ortega
www.qualityexcellence.es

2013/11/4 Adams, Jean jvad...@usgs.gov
> Lorenzo,
> You might want to post this as a new question to get some new eyes on it.
> Or, you could try posting your question to http://stackoverflow.com/.
> Scraping the web is a common topic for that group.
Re: [R] Download CSV Files from EUROSTAT Website
Hello,

If you want to get rid of the (bp) stuff, you can use lapply/gsub. Using Jean's code, slightly changed:

library(XML)
mylines <- readLines(url("http://bit.ly/1coCohq"))
closeAllConnections()
mytable <- readHTMLTable(mylines, which = 2, asText = TRUE, stringsAsFactors = FALSE)
str(mytable)
# strip flags in parentheses, e.g. (p) or (bp)
mytable[] <- lapply(mytable, function(x) gsub("\\(.*\\)", "", x))
# strip thousands separators
mytable[] <- lapply(mytable, function(x) gsub(",", "", x))
mytable[] <- lapply(mytable, as.numeric)
colnames(mytable) <- 2000:2013

Hope this helps,

Rui Barradas

On 04-11-2013 09:53, Lorenzo Isella wrote:
> Hello,
> And thanks a lot. This is indeed very close to what I need.
> I am trying to figure out how not to lose the headers and how to avoid
> downloading labels like (p) together with the numerical data I am
> interested in.
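To see what the lapply/gsub cleaning does without touching the network, the same three steps can be run on a small data frame. This is a minimal sketch in base R; the values and the column names V1 and V2 are invented for illustration and are not Eurostat data.

```r
# Made-up stand-in for the scraped table: character columns with
# thousands separators and flags such as (p) or (bp).
mytable <- data.frame(
  V1 = c("1,234.5", "98.1(p)"),
  V2 = c("2,000.0(bp)", "101.3"),
  stringsAsFactors = FALSE
)

# 1. Drop anything in parentheses (the flags).
mytable[] <- lapply(mytable, function(x) gsub("\\(.*\\)", "", x))
# 2. Drop the thousands separators.
mytable[] <- lapply(mytable, function(x) gsub(",", "", x))
# 3. Convert every column to numeric.
mytable[] <- lapply(mytable, as.numeric)

# Assigning to mytable[] keeps the data frame structure; the years are
# then attached by hand, as in the code above.
colnames(mytable) <- 2012:2013
str(mytable)
```

Note that the flags are discarded here; if they matter for the analysis they would have to be captured into separate columns before step 1.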
Re: [R] Download CSV Files from EUROSTAT Website
Thanks. I had already introduced these minor adjustments in the code, but the real problem (to me) is the information that gets lost: the informative names of the columns, the indicator type and the units.

Cheers

Lorenzo

On Mon, 04 Nov 2013 19:52:51 +0100, Rui Barradas ruipbarra...@sapo.pt wrote:
> Hello,
> If you want to get rid of the (bp) stuff, you can use lapply/gsub.
Re: [R] Download CSV Files from EUROSTAT Website
On Nov 4, 2013, at 11:03 AM, Lorenzo Isella wrote:
> Thanks. I had already introduced this minor adjustments in the code, but
> the real problem (to me) is the information that gets lost: the
> informative name of the columns, the indicator type and the units.

Maybe you should use their download facility rather than trying to deparse a complex webpage with lots of special user interaction features:

http://appsso.eurostat.ec.europa.eu/nui/setupDownloads.do

--
David.

David Winsemius
Alameda, CA, USA
Re: [R] Download CSV Files from EUROSTAT Website
On Mon, 04 Nov 2013 20:26:46 +0100, David Winsemius dwinsem...@comcast.net wrote:
> On Nov 4, 2013, at 11:03 AM, Lorenzo Isella wrote:
>> Thanks. I had already introduced this minor adjustments in the code, but
>> the real problem (to me) is the information that gets lost: the
>> informative name of the columns, the indicator type and the units.
>
> Maybe you should use their download facility rather than trying to
> deparse a complex webpage with lots of special user interaction features:
>
> http://appsso.eurostat.ec.europa.eu/nui/setupDownloads.do

Of course, for a single data set, I agree. In my case, I need to download and analyze several tens of data sets, and I need to be able to do this at regular time intervals, hence the need to automate the download as well.

Cheers

Lorenzo
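The batch requirement above (several tens of data sets, refreshed regularly) comes down to looping over a set of download URLs. A rough sketch in base R follows, assuming that a direct CSV URL is already known for each data set. The function name fetch_datasets, the data set IDs and the example URLs are hypothetical, and the sketch does not handle the session and cookie setup that the Eurostat download page is said to need elsewhere in this thread.

```r
# Download each URL in a named character vector to <dest_dir>/<name>.csv.
# Returns a named logical vector: TRUE where the download succeeded.
fetch_datasets <- function(urls, dest_dir = tempdir()) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  vapply(names(urls), function(id) {
    dest <- file.path(dest_dir, paste0(id, ".csv"))
    status <- tryCatch(
      download.file(urls[[id]], destfile = dest, quiet = TRUE),
      error   = function(e) 1L,  # treat any error as a failed download
      warning = function(w) 1L   # treat warnings (e.g. HTTP errors) likewise
    )
    identical(as.integer(status), 0L)  # download.file() returns 0 on success
  }, logical(1))
}

# Hypothetical usage; these URLs are placeholders, not real Eurostat links:
# urls <- c(dataset_a = "https://example.org/dataset_a.csv",
#           dataset_b = "https://example.org/dataset_b.csv")
# ok <- fetch_datasets(urls, dest_dir = "eurostat_csv")
# ok[!ok]  # data sets that need another attempt
```

Returning a success flag per data set, rather than stopping at the first failure, lets a scheduled run retry only the downloads that failed.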
Re: [R] Download CSV Files from EUROSTAT Website
Lorenzo,

I may be able to help you get started. You can use the XML package to grab the information off the internet.

library(XML)
mylines <- readLines(url("http://bit.ly/1coCohq"))
closeAllConnections()
# readHTMLTable() returns a list with one element per table on the page
mylist <- readHTMLTable(mylines, asText = TRUE)
mytable <- mylist$xTable

However, when I look at the resulting object, mytable, it doesn't have informative row or column headings. Perhaps someone else can figure out how to get that information.

Jean

On Thu, Oct 31, 2013 at 10:38 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote:
> Dear All,
> I often need to do some work on data which is publicly available on the
> EUROSTAT website.
> I have seen several ways to automatically download mainly the bulk data
> from EUROSTAT and postprocess it later with R, for instance
>
> http://bit.ly/HrDICj
> http://bit.ly/HrDL10
> http://bit.ly/HrDTgT
>
> However, what I would like to do is to download directly the csv file
> corresponding to a properly formatted dataset (typically a dynamic
> dataset) from EUROSTAT.
> To fix ideas, please consider the dataset at the following link
>
> http://bit.ly/1coCohq
>
> What I would like to do is to automatically read its content into R, or
> at least to automatically download it as a csv file (full extraction,
> single file, no flags and footnotes) which I can then manipulate easily.
> Any suggestion is appreciated.
> Cheers
>
> Lorenzo