Hi,
I'm pretty new to Python and R, and I haven't done anything similar before
(I'm a biologist), so forgive me if this has a trivial solution.
I'm trying to use rpy2 to make a quick-check kind of tool that runs a few
tests on a column of interest in a spreadsheet against all the other
columns. (I work with large sheets of experimental and clinical data, and we
often look for correlations in it as a starting point for a study.)
I have code that works in R, and I transliterated it as well as I could into
Python according to the documentation.
It seems to me that only ~90 columns are read into R when I do it through
rpy2, so I only get graphs for those. It also doesn't like it when I tell it
to look at one of the last columns to compare with the rest (the first
while loop).
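When a column name is formatted into an R command string, whether the name ends up quoted changes what R receives. A minimal sketch of that difference, in plain Python with no R involved; the helper name build_subset_command is made up for illustration, and "HME." is the column used in the R code at the end:

```python
def build_subset_command(column, quoted):
    """Return the R command string that would be sent for a column name."""
    # %r inserts the Python repr (with quotes); %s inserts the bare text.
    template = 'y= DATA[, %r]' if quoted else 'y= DATA[, %s]'
    return template % column


unquoted = build_subset_command("HME.", quoted=False)  # y= DATA[, HME.]
quoted = build_subset_command("HME.", quoted=True)     # y= DATA[, 'HME.']
```

In the unquoted form R sees a bare name and looks for an object called HME.; in the quoted form it sees a string and looks the column up by name. A purely numeric entry also behaves differently: unquoted it is a column index, quoted it is a column name.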
Since it's entirely possible I'm doing something very basic incorrectly,
here's my whole code. The quoted-out bit at the end is the R code I was
basing it on.
There are a few commented-out things here and there that I was using to
test different things.
Any additional suggestions would also be appreciated.
P.S. I don't think it matters, but just in case, I've been using R on
Windows and Python on Ubuntu.
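To check the ~90-column observation independently of rpy2, one option might be to count the header fields with Python's csv module and compare that number with ncol(DATA) in R. A minimal sketch; the function name count_csv_columns and the sample data are made up for illustration:

```python
import csv
import io


def count_csv_columns(handle):
    """Count the fields in the header row of an open CSV file."""
    header = next(csv.reader(handle))
    return len(header)


# Tiny in-memory stand-in for the real file (made-up column names and values).
sample = io.StringIO(u"SampleID,HME.,Weight\n1,0,3.2\n2,1,2.9\n")
n_columns = count_csv_columns(sample)
```

For the real file, the same function could be called on open("PLACENTAMERGED.csv"). If the two counts agree, the columns are all reaching R and the missing graphs are being lost somewhere later in the script.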

import rpy2.robjects as rob
r=rob.r
# To quickly compare one column of a spreadsheet to all the other columns
# with nonparametric tests and graph the results.


r('DATA= read.csv("PLACENTAMERGED.csv",header=T)')

data=r.DATA
needcol=True
while needcol== True:
    print "what column are you interested in testing?"
    exp=raw_input()
    try:
        # %r quotes the name, so R looks it up as a column name
        r('y= DATA[, %r]' %exp)
        y=r.y
    except:
        print "Column name not recognized. Please type it exactly as it appears when R reads it"
    else:
        needcol= False

#exp = "SampleID"
r('y= DATA[, %r]' %exp)
y=r.y

kruskal = rob.r['kruskal.test']
plot=rob.r['plot']
mtext=r['mtext']
sp=r['cor.test']
abline=r['abline']
lm=r['lm']

filename="%stests.txt"%exp
info=open(filename, 'a')

rob.r('names =colnames(DATA)')
names=r.names

pval={}
f=0
for x in names:
    rob.r('d <- DATA[, %r]' %x)
    d=r.d
    try:
        result = kruskal(r.y, r.d)
    except:
        continue
    else:
        f=f+1
        info.write("%s\n%s\n\n\n" %(x, result))
        p= float(result[2][0])

        if p < 0.05:  # same threshold as the R version
            pval[x]=p
        plot(r.d, r.y, xlab=x, ylab=exp)
        mtext("k-w test", 1, side=3, adj=1)
        mtext(p, side=3, adj=1)
        lable='%s%s'%(exp,x)
        r('dev.copy(pdf, "%skw.pdf")'%lable)
        r('dev.off()')


spval={}
sf=0
for x in names:
    rob.r('d <- DATA[, %r]' %x)
    d=r.d
    try:
        result = sp(y, d, method="spearman", exact=False)
    except:
        continue
    else:
        sf=sf+1
        info.write("%s\n%s\n\n\n" %(x, result))
        p= float(result[2][0])

        if p < 0.05:  # same threshold as the R version
            spval[x]=p
        plot(r.d, r.y, xlab=x, ylab=exp)
        r('l= lm(y~d)')
        l=r.l
        abline(l)
        mtext("spearman", 1, side=3, adj=1)
        mtext(p, side=3, adj=1)
        lable='%s%s'%(exp,x)
        r('dev.copy(pdf, "%ssp.pdf")'%lable)
        r('dev.off()')


if len(pval) ==0:
    print "Nothing is significant by the Kruskal-Wallis test"
else:
    print "These groups are significant by the Kruskal-Wallis test"
    for x in pval:
        print "%s, %f\n" %(x,pval[x])
if len(spval) ==0:
    print "Nothing is significant by the Spearman correlation"
else:
    print "These groups are significant by the Spearman correlation"
    for x in spval:
        print "%s, %f\n" %(x,spval[x])


info.close()
print "done"

"""
DATA<- read.csv("PLACENTAMERGED.csv",header=T)
y=DATA$HME.

#is anything significant?
for(x in colnames(DATA)){
  c<-DATA[,x]
  result<-try(kruskal.test(y~c, data=DATA));
  if(class(result) == "try-error") next;
  if(result[3]<.05) {
    print(x)
    print(result)}
  #else print (result[3]
}
print("done")


#test and graph categories
for(x in colnames(DATA)){
  c<-DATA[,x]
  result<-try(kruskal.test(y~c, data=DATA));
  if(class(result) == "try-error") next;
  print(result)
  boxplot(c~y, xlab=x)
  mtext( result[3], side=3, adj=1)

}
print("done")

#test and graph continuous
for(x in colnames(DATA)){
  c<-DATA[,x]
  result<-try(cor.test(y,c,method="spearman", exact=FALSE));
  if(class(result) == "try-error") next;
  print(result)
  summary(c)
  sd(c, na.rm=TRUE)
  plot(c~y, xlab=x)
  abline(lm(y~c))
  mtext( result[3], side=3, adj=1)

}
print("done")


"""