I just looked at a couple of files at NCSU and they looked like the
original files on the disk supplied with the book.  Has anyone found
ones there that are NOT as originally supplied?  Here is an example
(made up) of the kind of problems in the original files.  You might
have a data set with three variables, x and y quantitative, and z
categorical with three groups.  The data file looks just like the
table in the book: six columns, x,y,x,y,x,y with the pairs matching
z=a,b,c.  So the given data have to be stacked and the categorical
variable created.  Not too horrendous if you just want to use that one
dataset, but virtually all the files have similar (often worse)
problems, i.e., you cannot read them into an R dataframe as-is.  You
can find other actual examples in my review

  Review of Two Collections of Data for Use in a First Course in
  Statistics, The American Statistician, Vol.50, No.2 (May 1996), 
  pp. 168-169.
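
To make the made-up example above concrete, here is a minimal sketch of
the stacking step.  It is written in Python using only the standard
library, purely for illustration; the same reshaping can be done in R.
The data values, the group labels a, b, c, and the function names
stack_pairs and to_csv are all invented for this sketch and do not come
from the book or its files.

```python
import csv
import io

# Made-up data in the "typesetter's" layout: six tab-separated columns
# x,y,x,y,x,y, one (x, y) pair per group on each row.  Group b has one
# observation fewer, so its cells are blank in the last row.
WIDE = """\
1.0\t2.1\t1.5\t3.0\t2.0\t4.2
1.2\t2.4\t1.7\t3.3\t2.2\t4.5
1.4\t2.9\t\t\t2.5\t4.9
"""


def stack_pairs(text, groups=("a", "b", "c"), delimiter="\t"):
    """Turn side-by-side (x, y) pairs into long rows with a z column."""
    long_rows = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        for i, z in enumerate(groups):
            pair = row[2 * i: 2 * i + 2]
            # Skip pairs that are missing or blank (unequal group sizes).
            if len(pair) < 2 or not all(cell.strip() for cell in pair):
                continue
            long_rows.append(
                {"x": float(pair[0]), "y": float(pair[1]), "z": z}
            )
    # Stable sort: rows end up grouped by z, original order kept within z.
    long_rows.sort(key=lambda r: r["z"])
    return long_rows


def to_csv(rows):
    """Write the stacked rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["x", "y", "z"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = stack_pairs(WIDE)
print(to_csv(rows))
```

The output has one row per observation and three columns (x, y, z),
which is the shape a student can read straight into a data frame.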
  
I cleaned up and used about 20 datasets myself.  At the time I wrote
the review I had fantasies of finding 25 others who would each
volunteer to clean another 20.  I had long given up on that when
Dennis surprised me by offering far more than 20.  So I will be
working with him and will also look at the NCSU versions again. I have
also had others volunteer bits and pieces so I hope we will soon have
all or most in usable form.

PS

While R gurus may feel the problem is minor FOR THEM, I had hoped
to use the book in the following way.  After teaching topic X in a
gen. ed. intro. course, ask students to pick a data set of interest to
them and analyze it using X.  Beginners will not even notice the data
are not in standard format, and will spend lots of time wrestling with
the software to get usable output rather than focussing on the
statistical concepts.  I work a lot with high school teachers of
AP Statistics who themselves have usually taken 1 plus/minus 1
statistics courses and have NO experience with real data.  I would
LOVE to recommend this book to them but they would circle my home and
burn it to the ground after a few hours wrestling with the form the
data are in now.   

Forwarded message:
> Date: Thu, 31 Jan 2013 01:58:54 -0800
> From: Dennis Murphy <[email protected]>
> To: Jeff Laux <[email protected]>
> Cc: [email protected]
> Subject: Re: [R-sig-teaching] Handbook of Small Datasets
> 
> That's what the book is for: its purpose is to describe the variables
> and context of each data set. The book 'Data' by Andrews and Herzberg
> (1985) is similar in that respect. As I mentioned to Bob privately, I
> thought about making an R package of the data sets in HDLMO several
> years ago because I used a number of them in teaching, but then
> realized that if I wrote the help pages, I'd essentially be violating
> the copyright of the book...so that project died. But I do have a
> collection of R objects for the data sets which I'm editing and hope
> to finish before the weekend is out. Bob prefers a zipped csv archive,
> but I can make an R binary available (or a zipped version of .Rdata
> files) if anyone is interested.
> 
> Dennis
> 
> On Wed, Jan 30, 2013 at 7:54 PM, Jeff Laux <[email protected]> wrote:
> > Yes.  They can be found on NC State's Statistics department's website:
> >
> >      http://www.stat.ncsu.edu/working_groups/sas/sicl/data/
> >
> > However, the accompanying stories don't exist.  What is posted is just tab
> > delimited text files with numeric data.  Someone else will have to say what
> > the numbers are supposed to mean.
> >
> >
> >
> > On 1/30/2013 6:48 PM, Bob wrote:
> >>
> >> Just saw a mention of _Handbook of Small Datasets_.  Does anyone know
> >> if the data files ever got cleaned up and posted on the Internet?  I
> >> bought this when it came out and the disk included files that seemed to
> >> be created by cut and paste from the manuscript.  This meant that the
> >> "shape" of the data matched a typesetter's needs rather than a
> >> statistician's.  Most of the datasets needed considerable manual work
> >> before one could hand them off to students.  (I DID find what appeared
> >> to be the original dysfunctional versions online.)  It's really sad
> >> that a collection that was such a good idea on paper was so poorly
> >> implemented.
> >>
> >>
> >> ------->  First-time AP Stats. teacher?  Help is on the way! See
> >> http://courses.ncssm.edu/math/Stat_Inst/Stats2007/Bob%20Hayden/Relief.html
> >>        _
> >>       | |          Robert W. Hayden
> >>       | |          142 Main Street
> >>      /  |          Apartment 104
> >>     |   |          Jaffrey, New Hampshire 03452  USA
> >>     |   |          email: bob@ the site below
> >>    /    |          website: http://statland.org
> >>   | x   /          phone: (603) 532-7224 (home)
> >>   ''''''
> >>
> >> _______________________________________________
> >> [email protected] mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
> >>
> >
> 
> 
> 


------->  First-time AP Stats. teacher?  Help is on the way! See
http://courses.ncssm.edu/math/Stat_Inst/Stats2007/Bob%20Hayden/Relief.html
      _
     | |          Robert W. Hayden
     | |          142 Main Street
    /  |          Apartment 104
   |   |          Jaffrey, New Hampshire 03452  USA
   |   |          email: bob@ the site below
  /    |          website: http://statland.org
 | x   /          phone: (603) 532-7224 (home)
 ''''''

