On 07.05.2020 at 13:12 Deepayan Sarkar wrote:
On Thu, May 7, 2020 at 4:16 PM Thomas Petzoldt <t...@simecol.de> wrote:
On 07.05.2020 at 11:19 Deepayan Sarkar wrote:
On Thu, May 7, 2020 at 12:58 AM Thomas Petzoldt <t...@simecol.de> wrote:
Sorry if I'm joining a little bit late.

I put some related links and scripts together a few weeks ago, but then
stopped, because there is so much.

The data format employed by Johns Hopkins CSSE was sort of a big surprise
to me.
Why? I find it quite convenient to drop the first few columns and
extract the data as a matrix (using data.matrix()).
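
A minimal sketch of that idea, on a small invented table that only mimics
the wide CSSE layout (the values and the object names wide and counts are
made up for illustration, not taken from the real file):

```r
## Toy data in the wide layout: four identifier columns, then one column per day
wide <- data.frame(
    Province.State = c("", "Hubei", ""),
    Country.Region = c("Germany", "China", "Italy"),
    Lat  = c(51.2, 31.0, 41.9),
    Long = c(10.5, 112.3, 12.6),
    `1/22/20` = c(0, 444, 0),
    `1/23/20` = c(0, 444, 0),
    `1/24/20` = c(0, 549, 0),
    check.names = FALSE   # keep the date-like column names unchanged
)

## Drop the four identifier columns and keep the counts as a numeric matrix
counts <- data.matrix(wide[, -(1:4)])
rownames(counts) <- wide$Country.Region

dim(counts)
counts["China", "1/24/20"]
```

With row names set, a single series can be pulled out by country name,
e.g. counts["China", ].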

-Deepayan
Many thanks for the hint to use data.matrix().

My aim was not to say that it is difficult, especially as R has all the
tools for data mangling.

My surprise was that "wide tables" and non-ISO dates as column names are
not the "database way" that we generally teach our students.
Well, I am all for long format data when it makes sense, but I would
disagree that that is always the "right approach". In the case of
regular multiple time series, as in this context, a matrix-like
structure seems much more natural (and nicely handled by ts() in R),
and I wouldn't even bother reshaping the data in the first place.
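
As a rough illustration of the matrix-plus-ts() route (invented numbers;
indexing time simply as day 1, 2, ... is an assumption made here for
brevity, not how the linked analysis necessarily does it):

```r
## Invented cumulative counts: one row per country, one column per day
counts <- rbind(
    Germany = c(0, 0, 1, 4),
    Italy   = c(0, 2, 3, 3)
)
colnames(counts) <- c("1/22/20", "1/23/20", "1/24/20", "1/25/20")

## ts() expects time in rows, so transpose; time is indexed as day 1, 2, ...
series <- ts(t(counts), start = 1, frequency = 1)

## Daily increments for all countries at once
new_cases <- diff(series)
new_cases
```

The same object can be passed directly to plot() or lattice::xyplot()
without any reshaping.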

See, for example,

https://github.com/deepayan/deepayan.github.io/blob/master/covid-19/deaths.rmd

and

https://deepayan.github.io/covid-19/deaths.html

-Deepayan

Great, thank you for the link with the comprehensive lattice graphs and
the explanations. I like your package very much and have used it often
ever since it appeared on CRAN (3 of my CRAN packages depend on it). As a
"dynamic modeller", I always consider time as the first column, but I
agree that long tables are often, but not always, the right approach;
think of gridded multi-dimensional NetCDF data.

Many thanks for sharing your analysis publicly, I'll add your repo to my link list.

Thomas

With reshape2::melt, tidyr::gather or tidyr::pivot_longer, conversion is
quite easy, regardless of whether one wants to use the tidyverse or not,
see example below.

Again, thanks, Thomas


library("dplyr")
library("readr")
library("tidyr")

file <-
"https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"

dat <- read_csv(file)
names(dat)[1:2] <- c("Province_State", "Country_Region")
dat2 <-
    dat %>%
    ## drop the columns that must not be summed (by name, not by position)
    select(-Province_State, -Lat, -Long) %>%
    ## summarize Country/Region duplicates
    group_by(Country_Region) %>% summarise_all(sum) %>%
    ## make it a long table
    pivot_longer(cols = -Country_Region, names_to = "time") %>%
    ## convert the m/d/yy column names to Date (printed as ISO 8601)
    mutate(time = as.Date(time, format = "%m/%d/%y"))
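
For those who prefer to stay without the tidyverse, roughly the same
steps can be sketched in base R. This is only an illustration on a small
invented table (the values and the names wide, totals and long are made
up), not a drop-in replacement for the code above:

```r
## Toy wide table (invented values); check.names = FALSE keeps the date headers
wide <- data.frame(
    Country_Region = c("Australia", "Australia", "Italy"),
    `1/22/20` = c(0, 1, 0),
    `1/23/20` = c(2, 1, 3),
    check.names = FALSE
)

## Sum duplicate countries: rowsum() adds up rows sharing the same group label
totals <- rowsum(as.matrix(wide[, -1]), group = wide$Country_Region)

## Long format: one row per country and date
long <- as.data.frame(as.table(totals), stringsAsFactors = FALSE)
names(long) <- c("Country_Region", "time", "value")

## Convert the m/d/yy labels to Date
long$time <- as.Date(long$time, format = "%m/%d/%y")
long
```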



The opposite approach was taken in Germany, where the data were organized
as big JSON trees.

Fortunately, both can be "tidied" with R, and represent good didactic
examples for our students.
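
A small sketch of what tidying such a tree can look like, assuming the
jsonlite package is installed. The JSON below is purely hypothetical; its
structure and field names (features, attributes, region, cases) are
invented for illustration and are not the actual German data format:

```r
library("jsonlite")

## Hypothetical nested JSON; the field names are invented for illustration
txt <- '{"features": [
  {"attributes": {"region": "A", "cases": 3}},
  {"attributes": {"region": "B", "cases": 5}}
]}'

parsed <- fromJSON(txt)

## parsed$features is a data frame holding a nested data frame column;
## jsonlite::flatten() turns it into ordinary columns
flat <- jsonlite::flatten(parsed$features)

## Strip the prefix that flatten() adds to the nested column names
names(flat) <- sub("^attributes\\.", "", names(flat))
flat
```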

Here is yet another repo linking to the data:

https://github.com/tpetzoldt/covid


Thomas


On 04.05.2020 at 20:48 James Spottiswoode wrote:
Sure. COVID-19 Data Repository by the Center for Systems Science and 
Engineering (CSSE) at Johns Hopkins University is available here:

https://github.com/CSSEGISandData/COVID-19

All in csv format.


On May 4, 2020, at 11:31 AM, Bernard McGarvey <mcgarvey.bern...@comcast.net> 
wrote:

Just curious: does anyone know of a website that has data available in a format 
that R can download and analyze?

Thanks


Bernard McGarvey


Director, Fort Myers Beach Lions Foundation, Inc.


Retired (Lilly Engineering Fellow).

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

James Spottiswoode
Applied Mathematics & Statistics
(310) 270 6220
jamesspottiswoode Skype
ja...@jsasoc.com

--
Dr. Thomas Petzoldt
senior scientist

Technische Universitaet Dresden
Faculty of Environmental Sciences
Institute of Hydrobiology
01062 Dresden, Germany

https://tu-dresden.de/Members/thomas.petzoldt
