On 10/25/2021 7:09 AM, Philip Monk wrote:
Hello,

First post - apologies if I get anything wrong - either in describing the
question (I've only been using R for a week) or etiquette.

I have CSV files of Land Surface Temperature (LST) data derived from
Landsat 8 data and exported from Google Earth Engine.  I am investigating
whether the construction of utility-scale solar power plants affects the
local climate.

I need to tidy the CSV files so that I can use Two-way ANOVA w/repeated
measures but am having problems due to column headers (necessarily, I
think) being dates.

Each CSV currently has the following columns:

Buffer
Values 100-2000 in 100 increments.  Buffers are 100m wide and extend
outwards from each site boundary.

24 columns of monthly data.
Column headers are in date format (currently dd/mm/yyyy in Excel) and
relate to the date on which the original Landsat 8 image from which the LST
data are derived was captured.
I need these dates to calculate the 'Julian day' (1-365.25) for each month,
and also to extract the Year.

Time
Currently 1 = pre-construction and 2 = post-construction.

The data frame created when importing one of these CSVs into R looks like
this:

'data.frame':   20 obs. of  14 variables:
  $ Buffer     : int  100 200 300 400 500 600 700 800 900 1000 ...
  $ X15.01.2010: num  6.09 5.27 4.45 3.39 2.9 ...
  $ X16.02.2010: num  6.41 5.99 5.61 4.78 4.31 ...
  $ X20.03.2010: num  8.93 7.38 6.12 5.61 5.61 ...
  $ X24.04.2011: num  6.28 5.81 5.15 4.54 4.32 ...
  $ X07.05.2010: num  6.13 5.54 5.35 4.82 4.52 ...
  $ X08.06.2010: num  7.71 7.4 6.82 6.14 5.82 ...
  $ X13.07.2011: num  4.07 2.93 2.69 2.47 2.53 ...
  $ X11.08.2010: num  5.96 5.68 5.38 4.96 4.57 ...
  $ X12.09.2010: num  5.76 5.15 4.54 3.87 3.46 ...
  $ X17.10.2011: num  3.16 2.51 2.51 2.06 2.01 ...
  $ X15.11.2010: num  4.72 3.77 3.24 2.74 2.49 ...
  $ X01.12.2010: num  4.26 3.516 2.154 1.056 0.315 ...
  $ Time       : int  1 1 1 1 1 1 1 1 1 1 ...


Importing a CSV into R that has a date as a column header (in whatever
format) causes problems!  R adds the 'X', and converts the separator.

I was using 'gather' and 'pivot_longer' (see below) but the date issue has
wrecked that approach.  I've tried reformatting the date, trying to remove
the X, and going away to learn more about data frames, dplyr, and readr.
I'm not making any progress, though, and I'm just getting more confused.

Help requested
~~~~~~~~~~~~~~

How should I proceed to tidy the data such that I can:

*) extract the year and Julian day for each date, then convert the date to
the name of the month?
*) create a tidy table with columns for Buffer, Month, Year, Julian day,
LST (the values), and Time (1 = pre-construction, 2 = post-construction of
a solar farm).

Prior to deciding I needed to calculate the Julian day for use in ANOVA I
was doing this (with month names rather than dates - please remember I'm a
newbie!):

data <- read.csv(...
attach(data)
# data_long <- data %>% pivot_longer(!Buffer, names_to = "month", values_to = "LST")
# data_long <- data %>% pivot_longer(!Buffer, names_to = c("month", "Time"), names_sep = 13, values_to = "LST")
data_long <- gather(data, Month, LST, January:December, factor_key=TRUE)
data_long$Time <- as.factor(data$Time)
str(data_long)

'pivot_longer' didn't work, but 'gather' did to create the long data needed
for ANOVA.

For example:

'data.frame': 480 obs. of  4 variables:
  $ Buffer: int  100 200 300 400 500 600 700 800 900 1000 ...
  $ Time  : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
  $ Month : Factor w/ 12 levels "January","February",..: 1 1 1 1 1 1 1 1 1 1 ...
  $ LST   : num  NA 0.803 0.803 1.044 0.475 ...

Suggestions/hints/solutions would be most welcome.  :)

Thanks for your time,

Philip

Part-time PhD Student (Environmental Science)
Lancaster University, UK.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Using the data that you read in, you can use pivot_longer() and R's date functions to get what you want. For example:

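library(tidyr)      # pivot_longer(), %>%
library(lubridate)  # month(); base R's months() returns the month name instead

# pivot the date columns (their names start with "X") into long format
data_long <- data %>% pivot_longer(starts_with("X"), names_to = "chr_date", values_to = "LST")

# now you can use various date functions to get your month, day, and year
# for example, the month number (1-12):
data_long$month <- month(as.Date(data_long$chr_date, "X%d.%m.%Y"))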
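
To get everything asked for in the original question (Year, Julian day, month name, and Time kept as a factor), something like the sketch below might work. It uses base R's as.Date() and format() rather than lubridate. The small data frame is only a stand-in built from the first few values in the str() output quoted above, and the column names Date, Year, JulianDay and Month are my own choices, not anything required by R.

```r
library(tidyr)  # provides pivot_longer()

# Toy stand-in for one imported CSV: three buffers and three date columns,
# with values copied from the str() output in the question.
data <- data.frame(
  Buffer      = c(100L, 200L, 300L),
  X15.01.2010 = c(6.09, 5.27, 4.45),
  X16.02.2010 = c(6.41, 5.99, 5.61),
  X20.03.2010 = c(8.93, 7.38, 6.12),
  Time        = c(1L, 1L, 1L)
)

# Pivot only the date columns, so Buffer and Time stay as identifier columns.
data_long <- pivot_longer(data, starts_with("X"),
                          names_to = "chr_date", values_to = "LST")

# Parse the mangled header (e.g. "X15.01.2010") as day.month.year.
data_long$Date <- as.Date(data_long$chr_date, format = "X%d.%m.%Y")

data_long$Year      <- as.integer(format(data_long$Date, "%Y"))
data_long$JulianDay <- as.integer(format(data_long$Date, "%j"))  # day of year
# month.name is a base R constant, so the names do not depend on the locale.
data_long$Month     <- factor(month.name[as.integer(format(data_long$Date, "%m"))],
                              levels = month.name)
data_long$Time      <- factor(data_long$Time)

# Keep the columns requested in the question, in that order.
data_long <- data_long[, c("Buffer", "Month", "Year", "JulianDay", "LST", "Time")]
str(data_long)
```

Note that "%j" gives a whole-number day of the year (1 to 366), which is what is usually meant by 'Julian day' in this context.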

You may want to read up on the various date functions built in to R such as as.POSIXct, as.POSIXlt, as.Date, and maybe look at the contributed package, lubridate.
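
As a rough illustration of those base functions, the sketch below parses date headers with as.Date() and pulls the pieces out via as.POSIXlt(). It also shows read.csv()'s check.names argument, which (when set to FALSE) leaves the headers as they are in the file instead of adding the "X". The CSV text here is hypothetical: two rows with dd/mm/yyyy headers, made up to resemble the files described, so adjust the format string to whatever the real export contains.

```r
# Hypothetical two-row CSV with dd/mm/yyyy headers, standing in for a real export.
csv_text <- "Buffer,15/01/2010,16/02/2010,Time
100,6.09,6.41,1
200,5.27,5.99,1"

# check.names = FALSE stops read.csv() from rewriting headers into syntactic
# names, so no "X" prefix is added and the "/" separators survive.
raw <- read.csv(text = csv_text, check.names = FALSE)

# Every column other than Buffer and Time is assumed to be a date column.
date_cols <- setdiff(names(raw), c("Buffer", "Time"))
dates <- as.Date(date_cols, format = "%d/%m/%Y")

# POSIXlt exposes the date components as list elements.
lt <- as.POSIXlt(dates)
info <- data.frame(
  header     = date_cols,
  year       = lt$year + 1900L,         # stored as years since 1900
  julian_day = lt$yday + 1L,            # stored as day of year counted from 0
  month      = month.name[lt$mon + 1L]  # stored as month counted from 0
)
print(info)
```

Non-syntactic column names like "15/01/2010" need backticks if you refer to them directly, but pivot_longer() can select them with something like !c(Buffer, Time).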

Hope this is helpful,

Dan

--
Daniel Nordlund
Port Townsend, WA  USA

