Hi, thanks for making the data available.
I'm an academic researcher and have downloaded the dataset; however, I am struggling to understand the meaning and values of some of the columns. Is there a description or readme file for the dataset? The table parsed_logs contains file names such as c136315a_00001.bz2; are these files available?

Best regards,
Nawaf

On Friday, February 7, 2014 at 9:46:42 AM UTC-8, Gwern Branwen wrote:
> Hello everyone. As you know, by default Mnemosyne collects logs of all
> flashcard reviews, and has done so for years.
>
> This seems like it could be useful data for some projects (see for
> example my earlier emails about time-of-day and -week on performance),
> but it's difficult to get access to the corpus because it has become
> far too large to email or casually host; Peter provided a torrent ~4
> years ago but it died long ago. One can email him for a copy but that
> takes up his time and in any case, the logs still have to be processed
> into a SQL database (I tried recently but after 3 months of
> processing, it still hadn't finished because of the IO bottleneck of
> my hard drive). I've taken it upon myself to take a recent dump of
> logs from Peter, process them into a SQL database (~1 day with my new
> SSD), and upload them to my Amazon S3 account where anyone can
> download them.
>
> The link is
> https://s3.amazonaws.com/gwern-mnemosyne/2014-01-27-mnemosynelogs-all.db.xz
> (due to the size, I suggest a download manager like
> `wget --continue`).
>
> This is a 2.8GB file compressed with xz
> (https://en.wikipedia.org/wiki/Xz) which `unxz`/unpacks to an 18GB
> SQLite 3.x database with the MD5 hash 03569c5416dd6923613389be6d0cc9e1.
> It can be queried with commands like
> `$ sqlite3 -batch ./logs.db "SELECT timestamp,object_id,grade FROM log WHERE event==9;"`
> or via SQL interfaces like 'sqldf' for R.
>
> I commit to keeping the file up for 3 months before removing it, since
> S3 bandwidth is not free; if you'd like to see it stay longer, I
> accept Bitcoin donations at 1HbHpdhazqzfPtbcw9NA2H9R1GWNekm1L
>
> --
> gwern
> http://www.gwern.net/Spaced%20repetition
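
On the question of what the columns mean: absent a readme, one way to at least see which tables and columns exist is to ask SQLite itself. What follows is a minimal sketch, assuming Python 3 with its standard sqlite3 module and the unpacked database saved as ./logs.db, as in the quoted command. The names `describe_schema` and `event_rows` are illustrative only and not part of Mnemosyne; the only table and column names used are the ones that appear in the quoted query (log, timestamp, object_id, grade, event). It shows what is stored, not what the values mean.

```python
import sqlite3


def describe_schema(conn):
    """Map each table name to a list of (column_name, declared_type) pairs."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # Quote the table name so unusual names cannot break the statement.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        info = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        schema[table] = [(col[1], col[2]) for col in info]
    return schema


def event_rows(conn, event=9, limit=5):
    """Run the query from the quoted announcement, capped at a few rows.

    The LIMIT matters on an 18GB database; without it the query returns
    every matching row.
    """
    return conn.execute(
        "SELECT timestamp, object_id, grade FROM log "
        "WHERE event == ? LIMIT ?",
        (event, limit)).fetchall()
```

Something like `describe_schema(sqlite3.connect("./logs.db"))` would then return a dictionary that can be printed table by table, and `event_rows(conn)` gives a small sample of rows to compare against whatever documentation turns up.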
