Dear list members,
I am trying to generate uncertainty maps of available water capacity predicted with a pedotransfer function via Monte-Carlo simulation. I have the raster layers with the mean and SD of the input variables, so I can sample from their distributions (I ignore the correlations among variables, and consider their distributions as univariate in this exercise). All I want is to apply Monte-Carlo simulation for each cell of the raster. I found very useful some of the previous postings on this subject with the mc2d package ( http://r-sig-geo.2731867.n2.nabble.com/parallel-raster-processing-with-calc-and-mc2d-monte-carlo-simulation-td7587901.html#a7587902). Finally, I found that it was quicker to use the package lhs, generate a latin hypercube sample from a uniform distribution, and then obtain the sample for each of my variables with qnorm. I give an example of my function bellow. The PTF is a bit long; you can ignore most of it and just look at the beginning, where I sample from the distributions, and at the end (the output layers with the prediction interval limits). Applied to a small raster it is pretty quick. The problem is that I have to apply the function to 300 raster stacks of 2805374 cells each, and that is going to take a long time even with parallel computing using clusterR (a couple of weeks with a computer like mine). I imagine some of you have dealt with similar problems. Do you have any advice for sampling from probability distributions in monte-carlo simulations from big raster files? Or any advice for generating uncertainty maps? I believe much of the time is due to accessing the data (getValues). So I tried to do an alternative function, by extracting the data of the raster stack first into a dataframe, but that did not seem to save time either. Thank you in advance for all your help. Kind regards, Mercedes Roman require(raster) require(sp) require(rgdal) library(lhs) library(parallel) library(doParallel) library(snow) ### Generate example rasters r <- raster(nrow=10, ncol=10) set.seed(23850) s1 <- setValues(r,runif(n = 100, min = 0, max=1000)) s2 <- setValues(r,runif(n = 100, min = 0, max=100)) s3 <- setValues(r,runif(n = 100, min = 0, max=1000)) s4 <- setValues(r,runif(n = 100, min = 0, max=20)) s5 <- setValues(r,runif(n = 100, min = 0, max=300)) s6 <- setValues(r,runif(n = 100, min = 0, max=20)) s7 <- setValues(r,runif(n = 100, min = 0, max=200)) s8 <- setValues(r,runif(n = 100, min = 0, max=20)) s9 <- setValues(r,runif(n = 100, min = 0, max=14)) s10 <- setValues(r,runif(n = 100, min = 0, max=1)) s<-stack(s1, s2,s3,s4,s5,s6,s7,s8,s9,s10) plot(s) # Write the PTF function, sample with package lhs func_lhs <- function(x){ ###How many simulations we want? simulations <- 1000 sLHS <-randomLHS(n=simulations,k=5) ### Draws a Latin Hypercube Sample from a set of uniform distribution ### Transform the values into normal distribution with the mean and sd of each variable ### sand sim_sand <- qnorm(sLHS[,1], mean = x[[1]], sd = x[[2]]) ### constrained to be between 0 and 1000 sim_sand <- ifelse(sim_sand<0, 0, ifelse(sim_sand>1000, 1000, sim_sand)) ### silt sim_silt <- qnorm(sLHS[,2], mean = x[[3]], sd = x[[4]]) ### constrained to be between 0 and 1000 sim_silt <- ifelse(sim_silt<0, 0, ifelse(sim_silt>1000, 1000, sim_silt)) ### SOC sim_soc <- qnorm(sLHS[,3], mean = x[[5]], sd = x[[6]]) ### constrained to be between 0 and 1000 sim_soc <- ifelse(sim_soc<0, 0, ifelse(sim_soc>1000, 1000, sim_soc)) ### CEC sim_cec <- qnorm(sLHS[,4], mean = x[[7]], sd = x[[8]]) ### constrained to be > 0 sim_cec <- ifelse(sim_cec<0, 0, sim_cec) ### pH sim_ph <- qnorm(sLHS[,5], mean = x[[9]], sd = x[[10]]) ### constrained to be betweeen 0 and 14 sim_ph <- ifelse(sim_ph<0, 0, ifelse(sim_ph>14, 14, sim_ph)) ### Apply the PTF ### Define the functions ### Because from the simulations we cannot guarantee that the sum of texture fractions is 1000, we calculate clay as the difference 1000 - (sand + silt) sim_clay <- 1000 - (sim_sand + sim_silt) ### What if silt and sand sum > 1000? sim_clay <- ifelse(sim_clay <0, 0, sim_clay) ### To calculate the first parameter, SMres, the residual water content (cm³ cm-3), we need sand content, as % ### GSM has sand as g/kg - So the 2% by Toth et al is 20 g/kg here SMres <- ifelse(is.na(sim_sand), NA, ifelse(sim_sand >= 20, 0.041, 0.179)) ### Now for SMs, the saturated water content (cm³ cm-3) ### SMs = 0.5056 - 0.1437 * (1/(OC+1)) + 0.0004152 * Si SMsat <- 0.5056 - (0.1437 * (1/((0.1* sim_soc)+1))) + (0.0004152 * sim_silt * 0.1) #### the parameter alpha #### log10(alpha) = -1.3050 - 0.0006123 * Si - 0.009810 * Cl + 0.07611 * (1/(OC+1)) - 0.0004508 * Si * Cl + 0.03472 * Cl * (1/(OC+1)) - 0.01226 * Si * (1/(OC+1)) logAlpha <- -1.3050 - (0.0006123 * sim_silt * 0.1) - (0.009810 * sim_clay * 0.1) + (0.07611 * (1/((0.1*sim_soc)+1))) - (0.0004508 * sim_silt * sim_clay *0.01) + (0.03472 * sim_clay * 0.1 * (1/((sim_soc*0.1)+1))) - (0.01226 * sim_silt * 0.1 * (1/((sim_soc*0.1)+1))) alpha <- 10^logAlpha ### the parameter n ### log10(n-1) = 0.01516 - 0.005775 * (1/(OC+1)) - 0.24885 * log10(CEC) - 0.01918 * Cl - 0.0005052 * Si - 0.007544 * pH2 ### - 0.02159 * Cl * (1/(OC+1)) + 0.01556 * Cl * log10(CEC) + 0.01477 * (1/(OC+1)) * pH2 + 0.0001121 * Si * Cl - 0.33198 * (1/(OC+1)) * log10(CEC) logN_1 <- 0.01516 - (0.005775 * (1/((sim_soc * 0.1)+1))) - (0.24885 * log10(sim_cec)) - (0.01918 * sim_clay * 0.1) - (0.0005052 * sim_silt * 0.1) - (0.007544 * (sim_ph^2)) - (0.02159 * 0.1 * sim_clay * (1/((0.1 * sim_soc)+1))) + (0.01556 * 0.1 * sim_clay * log10(sim_cec)) + (0.01477 * (1/((0.1 * sim_soc)+1)) * (sim_ph^2)) + (0.0001121 * 0.01 * sim_silt * sim_clay) - (0.33198 * (1/((0.1*sim_soc)+1)) * log10(sim_cec)) n <- (10^logN_1)+1 ### define the soil matric potential (cm of water column) #h <- 101.9716 ### soil matric potential in cm of water column ----- 10 kPa equivalent to 101.9716 cmH2O ### We fit the Mualem-van Genuchten equation for theta at field capacity smFC <- SMres +((SMsat-SMres)/((1+((alpha*101.9716)^n))^(1-(1/n)))) smFC <- ifelse(smFC < 0, 0, smFC) ##### Soil moisture at permanent wilting point ##### thWP = 0.09878 + 0.002127* Cl - 0.0008366 * Si - 0.07670 *(1/(OC+1)) + 0.00003853 * Si * Cl + 0.002330 * Cl * (1/(OC+1)) + 0.0009498 * Si * (1/(OC+1)) smWP <- 0.09878 + (0.002127*sim_clay*0.1) - (0.0008366*sim_silt*0.1) - (0.07670 *(1/((sim_soc*0.1)+1))) + (0.00003853 * sim_silt*sim_clay*0.01) + (0.002330 * sim_clay*0.1 * (1/((sim_soc*0.1)+1))) + (0.0009498 * sim_silt*0.1 * (1/((sim_soc*0.1)+1))) smWP <- ifelse(smWP < 0, 0, smWP) ### Calculate the AWC as difference of both AWC <- smFC - smWP ### what if smWP > smFC? AWC <- ifelse(AWC<0,0,AWC) ### Calculate the PI AWC_lower <-quantile(AWC, probs =.05, na.rm=TRUE) AWC_upper <-quantile(AWC, probs =.95, na.rm=TRUE) smFC_lower <-quantile(smFC, probs =.05, na.rm=TRUE) smFC_upper <-quantile(smFC, probs =.95, na.rm=TRUE) smWP_lower <-quantile(smWP, probs =.05, na.rm=TRUE) smWP_upper <-quantile(smWP, probs =.95, na.rm=TRUE) all_results <- c(AWC_lower, AWC_upper, smFC_lower, smFC_upper, smWP_lower, smWP_upper ) ### Returns a nlayers raster stack. Ordered as smFC, smWP, and AWC return(all_results) } ### Apply function with clusterR ff <- function(x) calc(x, func_lhs) beginCluster(n = 8, type='SOCK') little_test <- clusterR(s, ff, export='func_lhs') endCluster() [[alternative HTML version deleted]] _______________________________________________ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo