On 13/01/2024 8:58 p.m., Rolf Turner wrote:
On Sat, 13 Jan 2024 17:59:16 -0500
Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

<SNIP>

My guess is that one of the bootstrap samples had a different
selection of countries, so factor(Country) had different levels, and
that would really mess things up.

You'll need to decide how to handle that:  If you are trying to
estimate the coefficient for Italy in a sample that contains no data
from Italy, what should the coefficient be?

Perhaps NA?  Ben Bolker conjectured that boot() might be able to handle
this.  Getting the NAs into the coefficients is a bit of a chore, though.  I
tried:

My question was really intended as a statistical question. From a statistical perspective, if I have a sampling scheme that sometimes generates a sample of size 0, should my CI be (-Inf, Inf) for a high enough confidence level?

A Bayesian might say that inference should be entirely based on the prior in the case of no relevant data. You could get similar numerical results by adding some fake data to every bootstrap sample, e.g. a single weighted observation for each country at your prior mean for that country, with weight chosen to match the strength of the prior. But Bayesian methods don't give confidence intervals, they give credible intervals, and those aren't the same thing even if they are sometimes numerically similar.
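
A rough sketch of that "fake data" idea, in case it helps. The data frame e_demo, the prior mean of 600, the prior weight of 0.1, and the choice to put the pseudo-observations at Time = 0 are all made up for illustration; none of them come from the data in this thread.

```r
# Hypothetical data standing in for the real data frame, which is not shown.
set.seed(1)
e_demo <- data.frame(
  Country = rep(c("A", "B", "C"), times = c(6, 6, 2)),
  Time    = rep(0:1, length.out = 14)
)
e_demo$Score <- 600 - 50 * e_demo$Time + rnorm(nrow(e_demo), sd = 40)
e_demo$w <- 1                      # real observations get weight 1

countries <- sort(unique(e_demo$Country))

# One pseudo-observation per country (all values here are assumptions).
prior <- data.frame(
  Country = countries,
  Time    = 0,                     # assumed: the prior refers to Time = 0
  Score   = 600,                   # assumed prior mean for every country
  w       = 0.1                    # assumed strength of the prior
)

stat_prior <- function(data, idx) {
  # Append the pseudo-observations to every bootstrap sample, so that
  # each country always contributes at least one (down-weighted) row.
  aug <- rbind(data[idx, ], prior)
  aug$Country <- factor(aug$Country, levels = countries)
  coef(lm(Score ~ Time + Country, data = aug, weights = w))
}

# A "sample" containing only country A still yields every coefficient.
stat_prior(e_demo, 1:6)
```

A function of this shape could be passed to boot() as the statistic. The results will depend on the prior mean and weight, and the resulting intervals are still not credible intervals.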

Duncan Murdoch


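func <- function(data, idx) {
  # Coefficient names from the full-data fit serve as the template.
  clyde <- coef(lm(Score ~ Time + factor(Country), data = data))
  # Fit on the bootstrap sample; absent countries yield no coefficient.
  ccc <- coef(lm(Score ~ Time + factor(Country), data = data[idx, ]))
  # Start from all NA, then fill in whatever was actually estimated.
  urk <- rep(NA_real_, length(clyde))
  names(urk) <- names(clyde)
  urk[names(ccc)] <- ccc
  urk
}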

It produced a result:

library(boot)
set.seed(42)
B <- boot(e, func, R = 1000)
B

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = e, statistic = func, R = 1000)


Bootstrap Statistics :
       original     bias    std. error
t1*  609.62500  3.6204052    95.39452
t2*  -54.81250 -1.6624704    36.32911
t3*  -41.33333 -2.7337992   100.72113
t4*  -96.00000 -1.0995718    99.78864
t5* -126.00000 -0.6548886    63.47076
t6*  -26.33333 -1.6516683    87.80483
t7*  -15.66667 -0.8391170    91.72467
t8*  -21.66667 -5.4544013    83.69211
t9*   18.33333 -0.7711001    85.57278

However I have no idea if the result is correct, or even meaningful. I
have no idea what I'm doing.  Just hammering and hoping. 😊️
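
One possible sanity check is to count how many replicates ended up NA for each coefficient. The sketch below uses made-up data (e_demo) and a lightly modified copy of the function (func_demo), since the real data are not shown; the guard against samples containing only one country is an addition for the demo.

```r
library(boot)

# Hypothetical data: country C has only 2 of the 14 rows, so it will be
# missing from a noticeable share of bootstrap samples.
set.seed(42)
e_demo <- data.frame(
  Country = rep(c("A", "B", "C"), times = c(6, 6, 2)),
  Time    = rep(0:1, length.out = 14)
)
e_demo$Score <- 600 - 50 * e_demo$Time + rnorm(nrow(e_demo), sd = 40)

# Coefficient names from the full-data fit, computed once.
full_names <- names(coef(lm(Score ~ Time + factor(Country), data = e_demo)))

func_demo <- function(data, idx) {
  d <- data[idx, ]
  urk <- setNames(rep(NA_real_, length(full_names)), full_names)
  # Added guard: lm() stops with an error if the factor has one level.
  if (length(unique(d$Country)) < 2) return(urk)
  ccc <- coef(lm(Score ~ Time + factor(Country), data = d))
  urk[names(ccc)] <- ccc
  urk
}

B_demo <- boot(e_demo, func_demo, R = 500)

# Number of replicates in which each coefficient could not be estimated.
n_missing <- setNames(colSums(is.na(B_demo$t)), full_names)
n_missing

# Percentile intervals computed from the non-missing replicates only.
ci <- apply(B_demo$t, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
colnames(ci) <- full_names
ci
```

Dropping the NA replicates amounts to conditioning on the country being present in the sample, which is a choice rather than an answer to the statistical question. Note also that if the reference country is absent from a sample, the remaining coefficients are contrasts against a different baseline even though their names still match.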

<SNIP>

cheers,

Rolf


