The datasets are also available on the publisher's website. I just compared several of those files with the versions at NCSU and with the version that came on a floppy with the book when it first came out. File sizes are exactly the same and, for the files I looked inside, the layout is sheer madness in all three versions.
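
For anyone who has not seen the files, here is a minimal sketch in base R of the kind of reshaping that the made-up example quoted below would need. The numbers and the column names (x.a, y.a, and so on) are invented for illustration and are not taken from any file in the book; a real file would likely need more work than this.

```r
# Made-up "wide" layout as described in the quoted example:
# six columns x,y,x,y,x,y, one pair per group z = a, b, c.
# All values and column names here are hypothetical.
wide <- read.table(text = "
1.0 2.1 1.5 3.0 2.0 4.2
1.2 2.4 1.7 3.3 2.2 4.5
1.4 2.6 1.9 3.7 2.4 4.9
", col.names = c("x.a", "y.a", "x.b", "y.b", "x.c", "y.c"))

groups <- c("a", "b", "c")

# Stack the three x,y pairs on top of each other and
# create the categorical variable z as we go.
long <- do.call(rbind, lapply(groups, function(g) {
  data.frame(x = wide[[paste0("x.", g)]],
             y = wide[[paste0("y.", g)]],
             z = g)
}))
long$z <- factor(long$z, levels = groups)

str(long)
```

The result has one row per observation and three columns (x, y, z), which is the form most R modelling and plotting functions expect. Trivial for an R guru, but not something a beginner should have to work out before getting to the statistics.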
Forwarded message:
>
> I just looked at a couple of files at NCSU and they looked like the
> original files on the disk supplied with the book. Has anyone found
> ones there that are NOT as originally supplied? Here is an example
> (made up) of the kind of problems in the original files. You might
> have a data set with three variables, x and y quantitative, and z
> categorical with three groups. The data file looks just like the
> table in the book: six columns, x,y,x,y,x,y with the pairs matching
> z=a,b,c. So the given data have to be stacked and the categorical
> variable created. Not too horrendous if you just want to use that one
> dataset but virtually all the files have similar (often worse)
> problems, i.e., you cannot read them into an R dataframe as-is. You
> can find other actual examples in my review
>
> Review of Two Collections of Data for Use in a First Course in
> Statistics, The American Statistician, Vol.50, No.2 (May 1996),
> pp. 168-169.
>
> I cleaned up and used about 20 datasets myself. At the time I wrote
> the review I had fantasies of finding 25 others who would each
> volunteer to clean another 20. I had long given up on that when
> Dennis surprised me by offering far more than 20. So I will be
> working with him and will also look at the NCSU versions again. I have
> also had others volunteer bits and pieces so I hope we will soon have
> all or most in useable form.
>
> PS
>
> While R gurus may feel the problem is minor FOR THEM, I had hoped
> to use the book in the following way. After teaching topic X in a
> gen. ed. intro. course, ask students to pick a data set of interest to
> them and analyze it using X. Beginners will not even notice the data
> are not in standard format, and will spend lots of time wrestling with
> the software to get usable output rather than focussing on the
> statistical concepts. I work a lot with high school teachers of
> AP Statistics who themselves have usually taken 1 plus/minus 1
> statistics courses and have NO experience with real data. I would
> LOVE to recommend this book to them but they would circle my home and
> burn it to the ground after a few hours wrestling with the form the
> data are in now.
>
> Forwarded message:
> > From: Dennis Murphy <[email protected]>
> > To: Jeff Laux <[email protected]>
> > Date: Thu, 31 Jan 2013 01:58:54 -0800
> > Cc: [email protected]
> > Subject: Re: [R-sig-teaching] Handbook of Small Datasets
> >
> > That's what the book is for: its purpose is to describe the variables
> > and context of each data set. The book 'Data' by Andrews and Herzberg
> > (1985) is similar in that respect. As I mentioned to Bob privately, I
> > thought about making an R package of the data sets in HDLMO several
> > years ago because I used a number of them in teaching, but then
> > realized that if I wrote the help pages, I'd essentially be violating
> > the copyright of the book...so that project died. But I do have a
> > collection of R objects for the data sets which I'm editing and hope
> > to finish before the weekend is out. Bob prefers a zipped csv archive,
> > but I can make an R binary available (or a zipped version of .Rdata
> > files) if anyone is interested.
> >
> > Dennis
> >
> > On Wed, Jan 30, 2013 at 7:54 PM, Jeff Laux <[email protected]> wrote:
> > > Yes. They can be found on NC State's Statistics department's website:
> > >
> > > http://www.stat.ncsu.edu/working_groups/sas/sicl/data/
> > >
> > > However, the accompanying stories don't exist.
> > > What is posted is just tab
> > > delimited text files with numeric data. Someone else will have to say
> > > what the numbers are supposed to mean.
> > >
> > > On 1/30/2013 6:48 PM, Bob wrote:
> > >>
> > >> Just saw a mention of _Handbook of Small Datasets_. Does anyone know
> > >> if the data files ever got cleaned up and posted on the Internet? I
> > >> bought this when it came out and the disk included files that seemed to
> > >> be created by cut and paste from the manuscript. This meant that the
> > >> "shape" of the data matched a typesetter's needs rather than a
> > >> statistician's. Most of the datasets needed considerable manual work
> > >> before one could hand them off to students. (I DID find what appeared
> > >> to be the original dysfunctional versions online.) It's really sad
> > >> that a collection that was such a good idea on paper was so poorly
> > >> implemented.
> > >>
> > >>
> > >> -------> First-time AP Stats. teacher? Help is on the way! See
> > >> http://courses.ncssm.edu/math/Stat_Inst/Stats2007/Bob%20Hayden/Relief.html
> > >>       _
> > >>      | |          Robert W. Hayden
> > >>      | |          142 Main Street
> > >>     /  |          Apartment 104
> > >>    |   |          Jaffrey, New Hampshire 03452 USA
> > >>    |   |          email: bob@ the site below
> > >>   /    |          website: http://statland.org
> > >>  |  x  /          phone: (603) 532-7224 (home)
> > >>  ''''''
> > >>
> > >> _______________________________________________
> > >> [email protected] mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching

-------> First-time AP Stats. teacher? Help is on the way! See
http://courses.ncssm.edu/math/Stat_Inst/Stats2007/Bob%20Hayden/Relief.html

Robert W. Hayden
website: http://statland.org
