Hello R-sig-ecology mailing list,
I’m working on a multivariate water quality index in which the concentration of
parameter i at site j is normalized by calculating its percentile rank within a
much larger reference dataset.
As an example, I have generated a sample dataset of water quality parameters
(df_sample) and a larger reference dataset (df_ref). I’d like to calculate the
percentile rank of each parameter at each site relative to the matching column
of the reference dataset.
Example data are below. A solution that avoids for loops would be preferred.
# generate sample data
df_sample <- data.frame(site = letters[1:10],
                        iron = runif(10, min = 0, max = 1),
                        nitrate = runif(10, min = 0, max = 10))
df_sample
# generate reference dataset
df_ref <- data.frame(iron = seq(0, 1, length.out = 1000),
                     nitrate = seq(0, 10, length.out = 1000))
df_ref
# now would like to calculate the percentile rank of iron and nitrate at all
# sites (a:j) based on the identical columns in df_ref, and include each as a
# new column in df_sample
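To pin down the calculation I have in mind, here is a minimal sketch. It assumes "percentile rank" means the percentage of reference values less than or equal to the sample value. The seed, the params vector, and the "_pct" column names are illustrative assumptions only, and base R's ecdf() is just one possible way to do this without an explicit for loop.

```r
# Sketch only: percentile rank = % of reference values <= the sample value.
set.seed(42)  # assumption: seed added purely for reproducibility

df_sample <- data.frame(site = letters[1:10],
                        iron = runif(10, min = 0, max = 1),
                        nitrate = runif(10, min = 0, max = 10))
df_ref <- data.frame(iron = seq(0, 1, length.out = 1000),
                     nitrate = seq(0, 10, length.out = 1000))

# Parameters present in both data frames (hypothetical helper vector)
params <- c("iron", "nitrate")

# ecdf() builds the empirical CDF of a reference column; evaluating it at the
# sample values gives the fraction of reference values <= each sample value.
df_sample[paste0(params, "_pct")] <- lapply(params, function(p) {
  100 * ecdf(df_ref[[p]])(df_sample[[p]])
})

df_sample
```

The lapply() call walks over column names rather than rows, so it should carry over to more parameters by extending params, provided each name exists in both df_sample and df_ref. Whether ties and missing values should be handled differently is something I have not settled.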
Many thanks,
|><̮Mâ̬tt͵)o>
_______________________________________________
R-sig-ecology mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology