##This tutorial shows a little about numpy, rpy2, and their interaction.  To use the tutorial, just go through and execute at the python prompt the python commands shown below, one by one, after reading each little piece of commentary.  You have to have numpy and rpy2 installed. 


####Numpy#####

import numpy
#numpy has a bunch of matrix and vector constructors
X_1D = numpy.array(range(30))
X_2D = numpy.random.randint(10,size=(50,30))
X_3D = numpy.random.randint(10,size=(50,30,50))
#numpy can construct arrays from python objects like lists and lists of lists
X = numpy.array([range(i,i+10) for i in range(10)])
#coordinate indexing is easy
X_1D[2]
X_2D[2,3]
X_3D[2,3,4]
#slicing is easy
X[2:5,3:7]
#you can easy slice by lists:
X_1D[[2,3,4]]
#and by boolean arrays
X[X[2,:] > 4]
X[(X[2,:] > 4) & (X[3,:] == 0)]
#assignment is easy
X[(X[2,:] > 4)] = -100
#many mathemtical things are available:  
Y = X.T
#There's a huge variety of mathematical things that numpy allows you to to matrices (e.g. linear algebra, fourier analysis, etc...).   you can find out by reading the numpy documentation or searching on the web. 

#there's also a "rec array" object with referencable headers
X = numpy.rec.fromrecords([('AAACTG','TTGAC',33.3),('CCAT','ACCTGT',1.2),('CCCAT','ACCGT',-1.2)],names=['Motif1','Motif2','Corr'])
X['Corr']; X['Motif1']
X = numpy.rec.fromarrays([['AACCT','TTGTTA','TGTTT'],['TTGTT','TTGGAA','ACCTT'],[33.3,-1.2,222.3]],names=['Motif1','Motif2','Corr'])

#of course you can turn a numpy object into a regular python object
Z = X_1D.tolist()
Z = X['Motif1'].tolist()

#COMMENT:  I think you'll see that for the syntax of matrix manipulations, numpy is much better than R -- the point of using R even if you have numpy (and therefore, the R-Python bridge module rpy2), is that many statistical modules are available in R that are not collected in one unified place in pure python libraries. 


####rpy2#####

#now let's look into rpy2
import rpy2.robjects as RO #<-- 'RO' is the basic module of r stuff
r = RO.r  #<- 'r' is the instance of the r program running embedded in your python terminal

#you can make a simple r object from a python object:
V = RO.IntVector(range(10))
print V

#it's easy to translate rpy2 objects back into python objects. All you need to do is apply the basic python object constructor functions, like list() or tuple(), direcly to rpy2 objects.  
PyV = list(V)    
PyV = tuple(V)

#some basic r functions are accessible as attributes of the r instance, ry2.robjects.r:
r.pchisq
r.rnorm
r.sum

#more generally, R functions are avabile by the correspondence that the R function 'fn'  corresponds to the python function "r['fn']"
#for example:
r['fisher.test']
r['matrix']

#all these functions are callable just like any other python functions -- the arguments are literals or rpy2 objects, and are the identical to what the arguments would be if the corresponing function was being called within R itself.   
r.sum(V)
r.rnorm(V)
M = r['matrix'](V,nrow = 2)
FTest = r['fisher.test'](M)

#[COMMENT:  what's happening when these rpy2 functions are called is that their arguments are literally passed to a running R-console instance in your computer's memory, and then evaluated there directly as R functions, operating on the R objects;  this is NOT an _emulation_ of R  - it is actual R running, using whatever copy of R you already have installed on your computer.  the 'instance' of R running is accesied by the object 'r' -- see below for more details]  

#rpy2 objects "pretty-print" at the python prompt
print M
print FTest

#R indexing is done via the "subset" method:
M.subset(2,3)

#getting numpy objects into r is easy -- you have to import the "numpy2ri" module from rpy2.robjects.
import RO.numpy2ri   

#Once you've imported numpy2ri, you can pass numpy objects directly to rpy2 constructor functions:
X = numpy.random.randint(10,size=(50,30))
x = RO.RMatrix(X)   #IMPORTANT:  this line will NOT work unless you'v imported numpy2ri

#getting rpy2 objects back into numpy is trivial: just feed them to the numpy constructor functions, just like with any other python object:
z = numpy.array(x)



####A little more about the thing I've been calling the "R instance":

#When you assign:
r = RO.r
#in rpy2, what you're doing in effect is two things: 
#      1)  STARTING a new copy of the R  console, as if you had opened a terminal and typed "R" --- but doing it invisibly in the computer's memory
#      2) and THEN, giving yourself access to that running R console, VIA the the python variable 'r'

#In general,  there are two methods of interacting with the R console via rpy2:
#       I) the command r['blah'] functions like RETURNING the variable blah from the R console.
#		 II) the command r('blah') functions like EXECUTING the command string 'blah' at the R console AND THEN RETURNING the result

#The point of the first method is that it gives you access to already-existing functions or variables in the R console, so that you can do thing with them as if they were python obects -- but no new R functions or variables will be assigned 

#The point of the second method is that it allows you to define new R functions or variables, and have them hang out in the R-console instance for future use 

#for examples of I):
#this sets the variable 'RMat' to be the R 'matrix' function:
RMat = r['matrix']       #note:  this is just like any other python assignment

#so that you can now use it as if it were a python function:
M = RMat(V,nrow = 2)   ## this is identical to something we did above 

#However, you have to have the R function 'matrix' _already defined_ at the R console for this kind of thing to work (in this case, since 'matrix' is a built-in preloaded R function, it's no problem)

#on the other hand, as an examples of method II),  this command:
X = r('ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)')
#does TWO things:  first, it creates a variable in the Global R variable space called 'ctl', and THEN, it turns the value of 'ctl' to the python variable 'X'.

#Because it is now defined in the R console instance, the variable 'ctl' can now be accessed by method I):
Y = r['ctl']
print X
print Y
list(X) == list(Y)

#Similarly,  the following executes a command string at the R console which defines a function in the global r space, AND THEN returns this function to the pythong variable 'F':
F = r('''f <- function(r) { 2 * pi * r }''')
F(3)
#This function f is accessible as a global R variable (like 'ctl' was):
G = r['f']
G(3)

#you can string commands together just like in R, by separating them with a semicolon:
X = r("g <- function(r) { 10 * pi * r }; g(3) ")
#Notice that what is RETURNED to the python variable X is the result of the command after the last semicolon:
X[0]
#But the previous commands, e.g. the definition of the R function 'g',  was still done in the R console instance and is sitting around for you to access:
r['g']
print r['g'](22)

#as another short example, the following returns the logistic regression of X on Y, where X and Y 1-D numpy arrays.  This shows how to write R "formulas" via Rpy.  Just like all R-object classes, formulas are accessed natively via the "RObjecst" module in Rpy2.    
rX = RO.RVector(X)
rY = RO.RVector(Y)
fmla = RO.RFormula('y ~ x')
env = fmla.getenvironment(); env['x'] = rX ; env['y'] = rY
summary = r['summary']
glm = r['glm']
Sum = summary(glm(fmla,family = 'binomial'))
