# Re: [R-sig-Geo] Finding the highest and lowest rates of increase at specific x value across several time series in R

sapply goes element by element through your list, where each element is one of
your dataframes. So mydata starts out as dataframe1, then dataframe2, then
dataframe3, etc. It is never all of them at once; sapply goes through the list
sequentially. So, at the end of the sapply call, you have a vector of length 10
where the first element corresponds to the rate closest to x=1 in dataframe1,
and the tenth element corresponds to the rate closest to x=1 in dataframe10.
If your columns are not named x and y, then the function should be edited
accordingly based on the names. It does assume that the x and y columns have
the same names across dataframes. For example, if x was actually "Time" and y
was "Rate", you could use
```
#Generate data
set.seed(5)
for (i in 1:10) {
  assign(x = paste0("df", i),
         value = data.frame(Time = sort(rnorm(n = 10, mean = 1, sd = 0.1)),
                            Rate = rnorm(n = 10, mean = 30, sd = 1)))
} # Create 10 data frames

# Define function (takes the first match if there are ties)
ExtractFirstMin <- function(df) {
  df$abs_diff <- abs(df$Time - 1)
  min_rate <- df$Rate[which.min(df$abs_diff)]
  return(min_rate)
} # Get first Rate value closest to Time=1

# Put all dataframes into a list
df_list<- list(df1,df2,df3,df4,df5,df6,df7,df8,df9,df10)

# Apply function across list
sapply(df_list, ExtractFirstMin)
________________________________
From: rain1...@aim.com <rain1...@aim.com>
Sent: Tuesday, May 16, 2023 12:46 PM
To: Alexander Ilich <ail...@usf.edu>; r-sig-geo@r-project.org
<r-sig-geo@r-project.org>
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

Hi Alexander,

Wow, thank you so very much for taking the time to articulate this answer! It
really gives a good understanding of what is going on at each stage in the
coding!

And sorry if I missed this previously, but the object "mydata" is defined based
on the incorporation of all dataframes? Since it is designed to swiftly obtain
the first minimum at x = ~1 across each dataframe, "mydata" must take into
account "dataframe1" to "dataframe10", correct?

Also, the "x" is simply replaced with the name of the x-column and the "y" with
the y-column name, if I understand correctly?

Again, sorry if I overlooked this, but that would be all, and thank you so very
much, once again for your help and time with this! Much appreciated!

~Trav.~

-----Original Message-----
From: Alexander Ilich <ail...@usf.edu>
To: r-sig-geo@r-project.org <r-sig-geo@r-project.org>; rain1...@aim.com
<rain1...@aim.com>
Sent: Tue, May 16, 2023 11:42 am
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

The only spot you'll need to change the names for is when putting all of your
dataframes in a list as that is based on the names you gave them in your script
when reading in the data. In the function, you don't need to change the input
to "dataframe1", and naming it that way could be confusing since you are
applying the function to more than just dataframe1 (you're applying it to all
10 of your dataframes). I named the argument df to indicate that you should
supply your dataframe as the input to the function, but you could name it
anything you want. For example, you could call it "mydata" and define the
function this way if you wanted to.

ExtractFirstMin <- function(mydata) {
  mydata$abs_diff <- abs(mydata$x - 1)
  min_rate <- mydata$y[which.min(mydata$abs_diff)]
  return(min_rate)
}

# The function has its own environment of variables that is separate from the
# global environment of variables you've defined in your script.
# When we supply one of your dataframes to the function, we are assigning that
# information to a variable in the function's environment called "mydata".
# Functions allow you to generalize your code so that you're not required to
# name your variables a certain way. Note that here we do assume that "mydata"
# has "$x" and "$y" columns, though.

# Without generalizing the code using a function, we'd need to copy and paste
# the code over and over again and make sure to change the name of the dataframe
# each time. This is very time consuming and error prone. Here's an example for
# the first 3 dataframes.

min_rate <- rep(NA_real_, 10) # initialize empty vector
df1$abs_diff <- abs(df1$x - 1)
min_rate[1] <- df1$y[which.min(df1$abs_diff)]

df2$abs_diff <- abs(df2$x - 1)
min_rate[2] <- df2$y[which.min(df2$abs_diff)]

df3$abs_diff <- abs(df3$x - 1)
min_rate[3] <- df3$y[which.min(df3$abs_diff)]

print(min_rate)
#>  [1] 29.40269 32.21546 30.75330       NA       NA       NA       NA       NA
#>  [9]       NA       NA

# With the function defined, we can run it for each individual dataframe,
# which is less error prone than copying and pasting but still fairly repetitive.
ExtractFirstMin(mydata = df1) # You can explicitly say "mydata ="
#> [1] 29.40269
# Or, equivalently, arguments are matched by the order in which they were
# defined in the function. Since there is just one argument, what you supply
# is assigned to "mydata".
ExtractFirstMin(df2)
#> [1] 32.21546
ExtractFirstMin(df3)
#> [1] 30.7533

# Rather than manually typing out the call to run the function on each
# dataframe and bringing the results together, we can instead use sapply.
# sapply takes a list of inputs and a function as arguments. It then applies
# the function to every element in the list and returns a vector (i.e. it goes
# through each dataframe in your list, applies the function to each one
# individually, and then records the result for each one in a single variable).
sapply(df_list, ExtractFirstMin)
#>  [1] 29.40269 32.21546 30.75330 30.12109 30.38361 28.64928 30.45568 29.66190
#>  [9] 31.57229 31.33907
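
# A minimal sketch, assuming the list elements are named: sapply() then returns
# a named vector, so the highest and lowest values can be traced back to their
# dataframes. ClosestY, toy_list, and the names "a", "b", "c" are made up for
# illustration and are not part of the code above.
ClosestY <- function(mydata) mydata$y[which.min(abs(mydata$x - 1))]
toy_list <- list(a = data.frame(x = c(0.9, 1.01), y = c(10, 20)),
                 b = data.frame(x = c(0.99, 1.2), y = c(35, 40)),
                 c = data.frame(x = c(0.5, 1.05), y = c(5, 15)))
rates <- sapply(toy_list, ClosestY)
names(rates)[which.max(rates)] # dataframe with the highest y near x = 1
#> [1] "b"
names(rates)[which.min(rates)] # dataframe with the lowest y near x = 1
#> [1] "c"
diff(range(rates)) # difference between the highest and lowest values
#> [1] 20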

________________________________
From: rain1...@aim.com <rain1...@aim.com>
Sent: Monday, May 15, 2023 4:44 PM
To: Alexander Ilich <ail...@usf.edu>; r-sig-geo@r-project.org
<r-sig-geo@r-project.org>
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

Hi Alexander and everyone,

I hope that all is well! Just to follow up with this, I recently was able to
try the following code that you had kindly previously shared:

ExtractFirstMin <- function(df) {
  df$abs_diff <- abs(df$x - 1)
  min_rate <- df$y[which.min(df$abs_diff)]
  return(min_rate)
} # Get first y value closest to x=1

Just to be clear, do I simply replace the "df" in that code with the name of my
individual dataframes? For example, here is the name of my 10 dataframes, which
are successfully placed in a list (i.e. df_list), as you showed previously:

dataframe1
dataframe2
dataframe3
dataframe4
dataframe5
dataframe6
dataframe7
dataframe8
dataframe9
dataframe10

Thus, using your example above, using the first dataframe listed there, would
this become:

ExtractFirstMin <- function(dataframe1) {
  dataframe1$abs_diff <- abs(dataframe1$x - 1)
  min_rate <- dataframe1$y[which.min(dataframe1$abs_diff)]
  return(min_rate)
} # Get first y value closest to x=1

df_list<- list(dataframe1, dataframe2, dataframe3, dataframe4, dataframe5,
dataframe6, dataframe7, dataframe8, dataframe9, dataframe10)

# Apply function across list
sapply(df_list, ExtractFirstMin)

Am I doing this correctly?

Thanks, again!

-----Original Message-----
From: Alexander Ilich <ail...@usf.edu>
To: rain1...@aim.com <rain1...@aim.com>
Sent: Thu, May 11, 2023 1:48 am
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

Sure thing. Glad I could help!
________________________________
From: rain1...@aim.com <rain1...@aim.com>
Sent: Thursday, May 11, 2023 12:17:12 AM
To: Alexander Ilich <ail...@usf.edu>
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

Hi Alexander,

Many thanks for sharing this! It was really helpful!

-----Original Message-----
From: Alexander Ilich <ail...@usf.edu>
To: rain1...@aim.com <rain1...@aim.com>
Sent: Wed, May 10, 2023 2:05 pm
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

One way to do this would be to put all your dataframes in a list, turn one of
the code implementations I gave earlier into a function, and then use sapply to
apply it across all the data frames.

#Generate data
set.seed(5)
for (i in 1:10) {
  assign(x = paste0("df", i),
         value = data.frame(x = sort(rnorm(n = 10, mean = 1, sd = 0.1)),
                            y = rnorm(n = 10, mean = 30, sd = 1)))
} # Create 10 data frames

# Define Functions (two versions based on how you want to deal with ties)
ExtractFirstMin <- function(df) {
  df$abs_diff <- abs(df$x - 1)
  min_rate <- df$y[which.min(df$abs_diff)]
  return(min_rate)
} # Get first y value closest to x=1

ExtractAvgMin <- function(df) {
  df$abs_diff <- abs(df$x - 1)
  min_rate <- mean(df$y[df$abs_diff == min(df$abs_diff)])
  return(min_rate)
} # Average all y values that are closest to x=1

# Put all dataframes into a list
df_list<- list(df1,df2,df3,df4,df5,df6,df7,df8,df9,df10)

# Apply function across list
sapply(df_list, ExtractFirstMin)
#>  [1] 29.40269 32.21546 30.75330 30.12109 30.38361 28.64928 30.45568 29.66190
#>  [9] 31.57229 31.33907

sapply(df_list, ExtractAvgMin)
#>  [1] 29.40269 32.21546 30.75330 30.12109 30.38361 28.64928 30.45568 29.66190
#>  [9] 31.57229 31.33907
________________________________
From: rain1...@aim.com <rain1...@aim.com>
Sent: Wednesday, May 10, 2023 1:40 PM
To: Alexander Ilich <ail...@usf.edu>; r-sig-geo@r-project.org
<r-sig-geo@r-project.org>
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

Hi Alexander,

Thank you so much for taking the time to outline these suggestions!

What if I wanted to only isolate the y-value at x = 1.0 across all of my 10
dataframes? That way, I could quickly see what the highest and lowest y-value
is at x = 1.0? That said, in reality, not all x values are precisely 1.0 (it
can be something like 0.99 to 1.02), but the idea is to target the y-value at x
= ~1.0. Is that at all possible?

Thanks, again!

-----Original Message-----
From: Alexander Ilich <ail...@usf.edu>
To: r-sig-geo@r-project.org <r-sig-geo@r-project.org>; rain1...@aim.com
<rain1...@aim.com>
Sent: Wed, May 10, 2023 10:31 am
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

So, using your data but with x=1 removed, 0.8 and 1.2 would be equally close to
1. Two potential options are to choose the y value corresponding to the first
minimum difference (in this case x=0.8, y=39), or to average the y values for
all that are equally close (in this case average the y values for x=0.8 and
x=1.2). I think the easiest way to do that would be to first calculate a column
of the absolute value of differences between x and 1 and then subset the
dataframe to the minimum of that column to extract the y values. Here's a base R
and a tidyverse implementation to do that.

#Base R
df <- data.frame(x = c(0, 0.2, 0.4, 0.6, 0.8, 1.2, 1.4),
                 y = c(0, 27, 31, 32, 39, 34, 25))
df$abs_diff <- abs(df$x - 1)

df$y[which.min(df$abs_diff)] # Get first y value closest to x=1
#> [1] 39
# Average all y values that are closest to x=1
mean(df$y[df$abs_diff == min(df$abs_diff)])
#> [1] 36.5

#tidyverse
rm(list=ls())
library(dplyr)

df<- data.frame(x=c(0,0.2,0.4,0.6,0.8,1.2,1.4),
y= c(0,27,31,32,39,34,25))
df<- df %>% mutate(abs_diff = abs(x-1))

# Get first y value closest to x=1
df %>% filter(abs_diff == min(abs_diff)) %>% pull(y) %>% head(1)
#> [1] 39

# Average all y values that are closest to x=1
df %>% filter(abs_diff == min(abs_diff)) %>% pull(y) %>% mean()
#> [1] 36.5
________________________________
From: rain1...@aim.com <rain1...@aim.com>
Sent: Wednesday, May 10, 2023 8:13 AM
To: Alexander Ilich <ail...@usf.edu>; r-sig-geo@r-project.org
<r-sig-geo@r-project.org>
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

Hi Alex and everyone,

My apologies for the confusion and this double message (I just noticed that the
example dataset appeared distorted)! Let me try to simplify here again.

My dataframes are structured in the following way: an x column and y column,
like this:

[X]

Now, let's say that I want to determine the rate of increase at about x = 1.0,
relative to the beginning of the period (i.e. 0 at the beginning). We can see
clearly here that the answer would be y = 43. My question is would it be
possible to quickly determine the value at around x = 1.0 across the 10
dataframes that I have like this without having to manually check them? The
idea is to determine the range of values for y at around x = 1.0 across all
dataframes. Note that it's not perfectly x = 1.0 in all dataframes - some could
be 0.99 or 1.01.

I hope that this is clearer!

Thanks,

-----Original Message-----
From: Alexander Ilich <ail...@usf.edu>
To: r-sig-geo@r-project.org <r-sig-geo@r-project.org>; rain1...@aim.com
<rain1...@aim.com>
Sent: Tue, May 9, 2023 2:23 pm
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

I'm currently having a bit of difficulty following. Rather than using your
actual data, perhaps you could include code to generate a smaller dataset with
the same structure with clear definitions of what is contained within each (r
faq - How to make a great R reproducible example - Stack
Overflow<https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example>).
You can design that dataset to be small with a known answer and then describe
how you got to that answer and then others could help determine some code to

Best Regards,
Alex
________________________________
From: R-sig-Geo <r-sig-geo-boun...@r-project.org> on behalf of rain1290--- via
R-sig-Geo <r-sig-geo@r-project.org>
Sent: Tuesday, May 9, 2023 1:01 PM
To: r-sig-geo@r-project.org <r-sig-geo@r-project.org>
Subject: [R-sig-Geo] Finding the highest and lowest rates of increase at
specific x value across several time series in R

I would like to attempt to determine the difference between the highest and
lowest rates of increase across a series of dataframes at a specified x value.
As shown below, the dataframes have basic x and y columns, with emissions
values in the x column, and precipitation values in the y column. Among the
dataframes, the idea would be to determine the highest and lowest rates of
precipitation increase at "approximately" 1 teraton of emissions (TtC)
relative to the first value of each time series. For example, I want to figure
out which dataframe has the highest increase at 1 TtC, and which dataframe has
the lowest increase at 1 TtC. However, I am not sure if there is a way to
quickly achieve this? Here are the dataframes that I created, followed by an
example of how each dataframe is structured:
#Dataframe objects created:
CanESMRCP8.5PL<-data.frame(get3.teratons, pland20)
IPSLLRRCP8.5PL<-data.frame(get6.teratons, pland21)
IPSLMRRCP8.5PL<-data.frame(get9.teratons, pland22)
IPSLLRBRCP8.5PL<-data.frame(get12.teratons, pland23)
MIROCRCP8.5PL<-data.frame(get15.teratons, pland24)
MPILRRCP8.5PL<-data.frame(get21.teratons, pland26)
GFDLGRCP8.5PL<-data.frame(get27.teratons, pland27)
GFDLMRCP8.5PL<-data.frame(get30.teratons, pland28)
#Example of what each of these look like:
>CanESMRCP8.5PL
    get3.teratons   pland20
X1      0.4542249 13.252426
X2      0.4626662  3.766658
X3      0.4715780  2.220986
X4      0.4809204  8.495072
X5      0.4901427 10.206458
X6      0.4993126 10.942797
X7      0.5088599  6.592956
X8      0.5187588  2.435796
X9      0.5286758  2.275836
X10     0.5389284  5.051706
X11     0.5496212  8.313389
X12     0.5600628  9.007722
X13     0.5708608 11.905644
X14     0.5819234  6.126022
X15     0.5926283  9.883264
X16     0.6042306  7.699696
X17     0.6159752  5.614193
X18     0.6274483  6.681527
X19     0.6394011 10.112812
X20     0.6519496  8.721810
X21     0.6646344 10.315931
X22     0.6773436 11.372490
X23     0.6903203  8.662169
X24     0.7036479 10.106109
X25     0.7180955 10.990867
X26     0.7322746 13.491778
X27     0.7459771 17.256650
X28     0.7604589 12.040960
X29     0.7753096 10.638796
X30     0.7898374  7.889500
X31     0.8047258 11.757174
X32     0.8204160 15.060151
X33     0.8359387  9.822078
X34     0.8510721 11.388695
X35     0.8661237 10.271567
X36     0.8815913 13.224285
X37     0.8984146 15.584782
X38     0.9154501  9.320024
X39     0.9324529  9.187128
X40     0.9497379 12.919805
X41     0.9672824 15.190318
X42     0.9854439 12.098606
X43     1.0041460 16.758629
X44     1.0241779 17.435182
X45     1.0451656 15.323428
X46     1.0663605 18.292109
X47     1.0868977 12.625429
X48     1.1079376 17.318583
X49     1.1295719 14.056624
X50     1.1516720 18.239445
X51     1.1736696 16.312087
X52     1.1963065 18.683315
X53     1.2195753 20.364835
X54     1.2425277 14.337167
X55     1.2653873 16.072449
X56     1.2888002 14.870248
X57     1.3126799 18.431717
X58     1.3362459 19.873449
X59     1.3593610 17.278361
X60     1.3833589 18.532887
X61     1.4083234 16.178170
X62     1.4328881 17.689810
X63     1.4572568 21.395131
X64     1.4821021 20.154886
X65     1.5072721 15.655971
X66     1.5325393 21.692028
X67     1.5581797 23.258303
X68     1.5842384 23.802459
X69     1.6108635 15.824673
X70     1.6365393 19.016228
X71     1.6618322 20.957593
X72     1.6876948 19.105363
X73     1.7134712 19.759288
X74     1.7392598 27.315595
X75     1.7652725 24.882263
X76     1.7913807 25.813408
X77     1.8173818 23.658997
X78     1.8434211 24.223432
X79     1.8695911 23.560818
X80     1.8960611 28.057708
X81     1.9228969 26.996265
X82     1.9493552 26.659719
X83     1.9759324 22.723687
X84     2.0026666 30.977267
X85     2.0290137 29.384326
X86     2.0549359 24.840383
X87     2.0811679 26.952620
X88     2.1081763 29.894790
X89     2.1349227 25.224040
X90     2.1613017 27.722623
>IPSLLRRCP8.5PL
    get6.teratons   pland21
X1      0.5300411  8.128827
X2      0.5401701  6.683660
X3      0.5503503 12.344974
X4      0.5607762 11.322411
X5      0.5714146 14.250646
X6      0.5825357 10.013592
X7      0.5937966  9.437394
X8      0.6051673  8.138396
X9      0.6168960  9.767765
X10     0.6290367  8.166579
X11     0.6413864 12.307348
X12     0.6539184 12.623931
X13     0.6667360 11.182448
X14     0.6800060 12.585040
X15     0.6935350 13.408614
X16     0.7071757  9.352335
X17     0.7211951 12.743725
X18     0.7356089 11.625612
X19     0.7502665 10.240418
X20     0.7650959 12.394282
X21     0.7800845 16.963066
X22     0.7953119 16.380090
X23     0.8107459 10.510501
X24     0.8260236 12.645911
X25     0.8414439 14.134851
X26     0.8572960 18.924963
X27     0.8732313 17.849050
X28     0.8892344 10.941533
X29     0.9057380 12.034925
X30     0.9223530 15.897904
X31     0.9391578 19.707692
X32     0.9563358 16.690375
X33     0.9738711 18.098571
X34     0.9916517 16.588447
X35     1.0096934 16.125172
X36     1.0279473 19.108647
X37     1.0463864 16.972994
X38     1.0653421 22.869403
X39     1.0842487 21.228874
X40     1.1035309 25.509754
X41     1.1230403 15.579367
X42     1.1426743 21.259726
X43     1.1626806 26.061262
X44     1.1833831 21.918530
X45     1.2045888 22.369094
X46     1.2262981 21.480456
X47     1.2481395 20.503543
X48     1.2703019 27.717028
X49     1.2929382 26.295449
X50     1.3157745 28.271455
X51     1.3390449 31.595651
X52     1.3626052 26.188018
X53     1.3863833 26.326999
X54     1.4102701 26.902272
X55     1.4343871 25.308764
X56     1.4584666 23.789699
X57     1.4831504 26.916504
X58     1.5080384 32.921638
X59     1.5331210 29.753267
X60     1.5582794 29.567720
X61     1.5832585 31.454097
X62     1.6085002 26.602191
X63     1.6339502 35.873728
X64     1.6594560 34.222654
X65     1.6851070 36.290959
X66     1.7109757 31.623912
X67     1.7368503 31.965520
X68     1.7626750 41.490310
X69     1.7883216 35.645934
X70     1.8141292 35.639422
X71     1.8405670 37.085608
X72     1.8672313 44.812777
X73     1.8939987 40.044602
X74     1.9208222 37.834526
X75     1.9478806 44.497335
X76     1.9750195 39.839740
X77     2.0024118 38.300529
X78     2.0302205 52.116649
X79     2.0581589 59.189047
X80     2.0861536 51.559857
X81     2.1141780 43.305779
X82     2.1421791 47.950074
X83     2.1703249 46.252149
X84     2.1985953 47.536605
X85     2.2266540 49.422466
X86     2.2547762 44.577399
X87     2.2827062 49.720523
X88     2.3102098 47.138244
X89     2.3379090 51.882832
X90     2.3656370 51.413472
Etc...
Any help with this would be greatly appreciated!
Thanks,
[[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
```
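
The approach discussed in the thread can be pulled together in one place. The following is a minimal sketch in base R, not code from the thread: the function name `ExtractNearest`, its `target`, `xcol`, and `ycol` arguments, and the two small made-up dataframes are assumptions for illustration. It returns the y value at the x closest to a target and, because the original question asked for the increase relative to the first value of each series, subtracts the first y value.

```r
# Hypothetical helper: y value at the x closest to `target`, minus the first
# y value of the series. Column names are passed in, so they need not be
# called x and y.
ExtractNearest <- function(df, target = 1, xcol = "x", ycol = "y") {
  abs_diff <- abs(df[[xcol]] - target)
  nearest_y <- df[[ycol]][which.min(abs_diff)] # first match if there are ties
  nearest_y - df[[ycol]][1]
}

# Made-up data standing in for two of the model dataframes
model_list <- list(
  modelA = data.frame(Emissions = c(0.5, 0.99, 1.5), Precip = c(10, 16, 20)),
  modelB = data.frame(Emissions = c(0.6, 1.02, 1.4), Precip = c(8, 18, 25))
)

# Extra named arguments to sapply() are passed on to the function
increase <- sapply(model_list, ExtractNearest,
                   xcol = "Emissions", ycol = "Precip")
increase
#> modelA modelB
#>      6     10
range(increase) # lowest and highest increase near x = 1
#> [1]  6 10
diff(range(increase)) # difference between them
#> [1] 4
```

Because the list is named, the result keeps the dataframe names, so `names(increase)[which.max(increase)]` identifies which series has the largest increase. Swapping `which.min(abs_diff)` for a mean over all rows where `abs_diff == min(abs_diff)` would give the tie-averaging behaviour of `ExtractAvgMin` instead.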