Maarten,

Fascinating question!  Yes, you should not see that behaviour.   Median has 
nothing to do with the issue, because you should get linear relationships for 
any constant value of the non-essential covariates.

And, thanks for telling me about SpatialFiltering.  I have been using PCNM() in 
the vegan  package to extract Moran Eigenvector Maps.  I use spdep for lots of 
things but hadn't (yet) discovered SpatialFiltering().  Good to find out about 
other implementations!

For the benefit of the list, Stephane Dray has a wonderful vignette on the use 
of Moran Eigenvector Maps (MEM) to "pull" spatial covariance out of the error 
covariance into the fixed effect model structure.  Pedro Peres-Neto used the 
name principal components of neighbor matrices when he popularized the method 
in the ecological literature but MEM is earlier.  Stephane uses functions in 
adespatial, but the principles apply equally well to the vegan implementation 
(pcnm) or spdep implementation (SpatialFiltering).   Her vignette is  
https://cran.r-project.org/web/packages/adespatial/vignettes/tutorial.html

I explored various possible explanations for your issue without success.  These 
included:
High correlation between elev and intercept
Coding the factor variable (soil)
All were ruled out when I constructed alternatives that showed the same problem.

Eventually, I decided that the issue was something subtle with predict when you 
have a matrix argument (EV_fit1 in your code) in the formula.  It looks like 
predict then ignores the newdata= argument and gives you predictions for the 
original observations.  Since those don't have the same values for 
"non-essential" covariates, it is no surprise that you get a random plot.  This 
is not apparent when you predict for the same number of observations as the 
original data set, but it is immediately apparent when you have a different 
number of rows in the newdata.

I simplified the problem by extracting the first two MEM eigenvectors and 
explicitly naming them in the formula.  Here's the code (using base graphics).  
It gives straight lines for the prediction vs elev plot, as expected.  This is 
not a reprex.  It piggy-backs on your code.

meuse$sp1 <- EV_fit1[,1]
meuse$sp2 <- EV_fit1[,2]

Model_example<-lm(cadmium ~ elev + dist  + factor(soil) + sp1 + sp2,
  data=meuse)

# bad predictions using the meuse data
meuse$predmod1 <- predict(Model_example, type="response")

with(meuse, plot(elev, predmod1, col=as.integer(soil)))
# this plot shows all sorts of variation vs elev because the other covariates 
are not constant

# predictions using a newly constructed data set
testdata <- expand.grid(elev=seq(5,10, 0.1), soil=1:3)
testdata$soil <- factor(testdata$soil)

# You can choose any constant you like for dist, sp1 and sp2
testdata$dist <- 0.211
testdata$sp1 <- median(meuse$sp1)
testdata$sp2 <- median(meuse$sp2)

testdata$predmod1 <- predict(Model_example, newdata=testdata)

with(testdata, plot(elev, predmod1, col=as.integer(soil)))
# this plot shows straight lines vs elev because other covariates are held 
constant, as expected

Best,
Philip Dixon


        [[alternative HTML version deleted]]

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Reply via email to