Hello All,

I have revised the faithful.py demo from the RPy web site to run using 
rpy2.    I have tried to minimize changes to the original program.  This 
version runs on openSUSE 11.2 (x86_64) using Python 2.6, rpy2 version 
2.1.alpha 3,   and R version 2.10.1. 

I am posting this to the rpy list because I thought it might be useful to 
get expert input on preferred rpy2 syntax.  Is this example representative 
of how you prefer to use the rpy  2.1 interface to R?   If not, how might 
I change it so it represents the preferred usage? 

I have provided a copy to Laurent Gautier.  He has made some suggestions 
that I have yet to incorporate.  I believe he would like to include this 
version (or some variation of it) in the new RPy 2.1 documentation.   So 
now would be a good time establish the "preferred" RPy usage, if such a 
thing exists!

Thanks in advance for you input!

John A. Schroeder
Idaho National Laboratory
Battelle Energy Alliance, LLC
P.O. Box 1625
Idaho Falls, ID  83415-3850

Ph:   208-526-8755
FAX:  208-526-2930


#
#  This is a rpy2 (2.1.alpha) version of the demo at
#  http://rpy.sourceforge.net/rpy_demo.html
#
#  I have done my best to preserve the code originally provided by
#  Tim Churches.  There may be a few places where some edits might be
#  in order to represent preferred rpy2 syntax. If so, I would appreciate
#  the feedback.  I have done this as a learning exercise, and donate it
#  to the RPy community.
#
#  John A. Schroeder
#  snakeriver...@gmail.com
#

import rpy2.robjects as robjects

def main():

    r = robjects.r

    faithful_data = {"eruption_duration":[],
                     "waiting_time":[]}

    f = open('faithful.dat','r')

    for row in f.readlines()[1:]: # skip the column header line
        splitrow = row[:-1].split(" ")
        faithful_data["eruption_duration"].append(float(splitrow[0]))
        faithful_data["waiting_time"].append(int(splitrow[1]))

    f.close()

    ed = robjects.FloatVector(faithful_data["eruption_duration"])
    edsummary = r.summary(ed)
    print "Summary of Old Faithful eruption duration data"
    for k in edsummary.names:
        print k + ": %.3f" % edsummary.rx(k)[0]
    print
    print "Stem-and-leaf plot of Old Faithful eruption duration data"
    print r.stem(ed)

    r.png('faithful_histogram.png',width=733,height=550)
    r.hist(ed,r.seq(1.6, 5.2, 0.2), prob=1,col="lightgreen",
           main="Old Faithful eruptions",xlab="Eruption duration 
(seconds)")
    r.lines(r.density(ed,bw=0.1),col="orange")
    r.rug(ed)
    dev_off = robjects.r('dev.off')
    dev_off()

    long_ed = robjects.FloatVector(filter(lambda x: x > 3, ed))
    r.png('faithful_ecdf.png',width=733,height=550)
    r.library('stats')
    params = {'do.points' : False,
              'verticals' : 1,
              'main' : "Empirical cumulative distribution function of " +
              "Old Faithful eruptions longer than 3 seconds"}
    r.plot(r.ecdf(long_ed), **params)
    x = r.seq(3,5.4,0.01)
    r.lines(x,r.pnorm(x,mean=r.mean(long_ed),
            sd=r.sqrt(r.var(long_ed))), lty=3, lwd=2, col="red")
    dev_off = robjects.r('dev.off')
    dev_off()

    r.png('faithful_qq.png',width=733,height=550)
    r.par(pty="s")
    r.qqnorm(long_ed,col="blue")
    r.qqline(long_ed,col="red")
    dev_off = robjects.r('dev.off')
    dev_off()

    print
    print "Shapiro-Wilks normality test of Old Faithful eruptions longer 
than 3 seconds"
    shapiro_test = robjects.r('shapiro.test')
    sw = shapiro_test(long_ed)
    print "W = %.4f" % sw.rx('statistic')[0][0]
    print "p-value = %.5f" % sw.rx('p.value')[0][0]

    print
    print "One-sample Kolmogorov-Smirnov test of Old Faithful eruptions 
longer than 3 seconds"
    ks_test = robjects.r('ks.test')
    ks = ks_test(long_ed,"pnorm", mean=r.mean(long_ed), 
sd=r.sqrt(r.var(long_ed)))
    print "D = %.4f" % ks.rx('statistic')[0][0]
    print "p-value = %.4f" % ks.rx('p.value')[0][0]
    print "Alternative hypothesis: %s" % ks.rx('alternative')[0][0]
    print

    # On linux, if you are plotting to X11, this is necessary to keep 
plots
    # visible.  They disappear when the program terminates (not so on
    # Windows)
    # end = raw_input("Hit any key to end: ")

if __name__ == '__main__': main()

------------------------------------------------------------------------------
SOLARIS 10 is the OS for Data Centers - provides features such as DTrace,
Predictive Self Healing and Award Winning ZFS. Get Solaris 10 NOW
http://p.sf.net/sfu/solaris-dev2dev
_______________________________________________
rpy-list mailing list
rpy-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rpy-list

Reply via email to