On Sat, Apr 25, 2009 at 12:06 PM, Bill Price wrote:
...
> My advisor killed my idea to have a third comparison group of
> non-software users. I still disagree with that decision, but her
> argument was essentially that any difference would be trivially
> explained by the fact that my intervention amounted to a "studying
> aid." She wanted my only comparisons to be between two groups with
> just one variable modified between them, namely the utilization of
> spaced repetition scheduling. I still think it would be useful
> knowledge to at least SEE where the rest of the class was on this
> material at those two evaluation points, though; so as I said, I
> disagree with her reasoning and decision.

That's too bad; wouldn't it have been really easy to have another group
you just did pre- and post-tests on? Oh well.

> Also, to your point about the score increases: it is true that the
> average scores for the two groups increased in roughly-equivalent
> magnitudes--about 12 points for the spaced group and about 10 points
> for the intuitive group. However, I specifically said that the spaced
> group showed a SIGNIFICANT (as in, p < .05) increase. The difference
> for the intuitive group was NOT statistically significant, largely due
> to the massive variance of the group. One reason the intuitive group
> has such a high average improvement, actually, is that one individual
> showed an improvement of 42 points between pre-assessment and
> post-assessment, which is a wholly anomalous score. My assumption is
> that that individual simply hadn't learned the material before the
> pre-assessment was administered, and so failed it terribly.

Yes, that's true. My bad. I'd ask how they stack up if you remove that
one guy, but I remember vaguely you covered that. :)

> One thing that must be cleared up is that 32.6% figure you tossed out.
>
> Out of the 9 sub-decks the participants were given, only 2 of them
> (decks 2 and 3) were tested on the pre-assessment and the
> post-assessment. This was to buy time for any benefits due to the
> spaced repetition schedule to manifest. My assumption was that
> subjects would not place much emphasis on studying such "old" material
> over the course of the experiment, and so the intuitive group would be
> expected to study decks 2 and 3 heavily at the start of the experiment
> and then to drop off for the rest of the study.
>
> The 32.6% figure meant that, during the time period that I tracked,
> the spaced repetition group studied 32.6% more cards from decks 2 and
> 3 than the intuitive repetition group did.
>
> Recall that I was unable to collect adequate learning data for the
> first five days of the study, though. Thus, I do not know how many
> times members of either group studied decks 2 and 3 during that time.
> That was the time in which I predicted the intuitive group would study
> those decks the heaviest, so it MAY be the case that the total trials
> of the assessment-relevant material between the groups was near-equal.
> If I could show that, then my results would be much stronger, but I
> simply lack the data.

OK, that's interesting. I don't consider it too bad from our (spaced
partisans') view though - even if the intuitives had studied so hard as
to make up the repetition gap, they still wound up with inferior
results.

> You are correct in saying that I should have given a more full account
> of subject compliance. I will draw up a table and put it in my
> Appendix D.

(Cool. See my other email.)

> One other point: One reason I decided not to do any analyses of the
> easiness factors assigned within or between groups is that subjects
> showed rather different opinions of what each grade meant. This was
> especially prevalent in the intuitive group; one of the intuitive
> subjects began by only assigning grades of 0 or 4, for instance.
> I tried to do running checks of everyone's learning data to ensure that
> the grading distributions seemed reasonable and to alert people to any
> issues I had with their grading, but on such a subjective task, it's
> difficult to know what's "correct." Also, to be fair, the two groups
> had different ideas of what the grades were for. The spaced repetition
> group knew the purpose behind them, but the intuitive repetition
> subjects had no clear idea what the grades were for (beyond some
> general idea I gave that the grades would "help us determine which
> characters in the decks were especially difficult to remember," etc).

I'm not sure it's *that* hopeless. Even if people are doing odd schemes
like assigning only 0s and 4s, you could still capture this by taking
0-2 as hard, 3 as medium, and 4-5 as easy. But I suppose if there were
any overall bias in either group, you would have noticed it during your
running checks... Another thing for future research, I suppose.

> --Bill

--
gwern

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "mnemosyne-proj-users" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/mnemosyne-proj-users?hl=en
-~----------~----~----~----~------~----~------~--~---
