Hi Byron, thanks very much for this. We put it in front of a statistician and got this back:

With the caveat that I'm not actually a biostatistician, I think year is the wrong "treatment" here. You either want to

 1. binarize before/after the major shift in instructor training
    procedure (note this has major confounding issues with time, and
    thus with the popularity and size of Data Carpentry, etc.), or
 2. compare all the actual training sessions. There are a lot of them,
    which may destroy your power, but if you see some that are way
    low, you can look at them and see if they were, e.g., all taught
    by the same person, or have other characteristics in common.

To do this type of modelling, though, I think I'd really want more covariates to put in the model, at either the training-session or the trainee level. The more (true, relevant) information you give the model, the better it can answer your question, and it seems pretty starved for info if you're JUST giving it year...
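
For what it's worth, option 1 could be sketched roughly as below. This is only an illustration, not Byron's actual analysis: the cutoff date, the trainee records, and the reading of "survival" as "has not yet taught" are all assumptions made up for the example, and the Kaplan-Meier estimator is hand-rolled in plain Python to keep it self-contained.

```python
from collections import defaultdict
from datetime import date


def kaplan_meier(durations, observed):
    """Return [(time, survival)] at each time where at least one event occurred.

    durations: time until the event or until censoring.
    observed:  True if the event happened, False if the record is censored.
    """
    pairs = sorted(zip(durations, observed))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        removed = 0
        # Gather everyone whose time equals t (events and censored alike).
        while i < len(pairs) and pairs[i][0] == t:
            events += int(pairs[i][1])
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# ASSUMPTION: hypothetical date of the shift in training procedure.
CUTOFF = date(2016, 1, 1)

# ASSUMPTION: made-up records of
# (training date, days until first teaching or censoring, taught yet?).
trainees = [
    (date(2015, 3, 1), 90, True),
    (date(2015, 6, 15), 200, False),
    (date(2015, 9, 1), 150, True),
    (date(2016, 2, 1), 60, True),
    (date(2016, 3, 10), 45, False),
    (date(2016, 4, 5), 30, True),
]


def era(trained_on):
    """Binarize the training date relative to the cutoff."""
    return "before" if trained_on < CUTOFF else "after"


groups = defaultdict(lambda: ([], []))
for trained_on, days, taught in trainees:
    durations, observed = groups[era(trained_on)]
    durations.append(days)
    observed.append(taught)

curves = {name: kaplan_meier(d, o) for name, (d, o) in groups.items()}

for name, curve in curves.items():
    print(name, curve)
```

The confounding caveat still applies: anything else that changed over time ends up folded into the before/after difference.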


I can easily label sessions as "two-day" or "multi-week", which is the major distinguishing characteristic. I don't think we'll get much signal yet from labeling by instructor, since I taught or co-taught everything before January, and we've only had 4 sessions since then that were taught solely by other people (a number I sincerely hope will go up). But this is still pretty cool - I'll see if I can cook up a better data set.
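
To make the labeling concrete, here is a rough sketch of the kind of session-level table I have in mind. The session IDs and trainee outcomes are invented for illustration (only the "two-day" and "multi-week" labels are real), and the raw rates it computes ignore censoring, so treat it as a first look for outlier sessions rather than a substitute for the survival model.

```python
from collections import Counter

# ASSUMPTION: hypothetical session IDs; only the format labels are real.
session_format = {
    "s1": "multi-week",
    "s2": "two-day",
    "s3": "two-day",
}

# ASSUMPTION: made-up (session ID, has the trainee taught yet?) records.
trainees = [
    ("s1", True), ("s1", True), ("s1", False), ("s1", False),
    ("s2", True), ("s2", False), ("s2", False), ("s2", False),
    ("s3", True), ("s3", True), ("s3", True), ("s3", True),
]


def rate_by(key_of, rows):
    """Fraction of trainees who have taught, grouped by key_of(session ID)."""
    totals, taught = Counter(), Counter()
    for session_id, did_teach in rows:
        key = key_of(session_id)
        totals[key] += 1
        taught[key] += int(did_teach)
    return {key: taught[key] / totals[key] for key in totals}


# Per-session rates, to spot sessions that look unusually low.
by_session = rate_by(lambda sid: sid, trainees)

# Per-format rates, using the session-level covariate.
by_format = rate_by(lambda sid: session_format[sid], trainees)

print(by_session)
print(by_format)
```

An instructor column could be added the same way once there are enough sessions taught by other people to make it informative.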

Cheers,

Greg


On 2016-05-23 4:41 PM, Byron Smith wrote:
Could someone take a look at this survival analysis of the same data [1]? I'm by no means an expert, so I'd like to know if I'm doing anything obviously wrong.

[1]: http://bsmith89.github.io/swc-instructor-training-analysis/


--
Dr Greg Wilson
Director of Instructor Training
Software Carpentry Foundation
