[R] degrees of freedom

2014-01-23 Thread Iain Gallagher
Hello List

I have been asked to analyse some data for a colleague. The design consists of 
two sets of animals:


First set of three - one leg is treated and the other is not under two 
different conditions (control and overload are the same animals - the control 
leg is the control (!) for the treated leg);


Second set of three - one leg is treated and the other is not under two 
different conditions (high_fat and high_fat_overload are the same animals with 
high_fat being control leg for high_fat_overload).

Ideally I'd like to find differences between the treatments.

bip <- structure(list(group = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 
3L, 3L, 4L, 4L, 4L), .Label = c("control", "overload", "high_fat", 
"high_fat_overload"), class = "factor"), variable = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "BiP", class = "factor"), 
    animal = structure(c(1L, 3L, 5L, 1L, 3L, 5L, 2L, 4L, 6L, 
    2L, 4L, 6L), .Label = c("rat1_c", "rat1_hf", "rat2_c", "rat2_hf", 
    "rat3_c", "rat3_hf"), class = "factor"), value = c(404979.65625, 
    783511.8125, 677277.625, 1576900.375, 1460101.875, 1591022, 
    581313.75, 992724.1875, 1106941.5, 996600.375, 1101696.5, 
    1171004.375)), .Names = c("group", "variable", "animal", 
"value"), row.names = c(NA, 12L), class = "data.frame")

I chose to analyse this as a mixed effects model with treatment as a fixed 
effect and animal as random.

library(lme4)
model1 <- lmer(value ~ group + (1|animal), data=bip)
summary(model1)

And then compare this to no treatment with:

anova(model1)

From this I wanted to work out whether 'treatment' was significantly affecting 
BiP levels by calculating the critical value of F for this design. I have 2 
groups of animals and 3 animals per group. My calculation for the degrees of 
freedom for treatment is 4-1=3.

I'm not sure about the degrees of freedom for the denominator though. Since I'm 
comparing a model with treatment to one without (i.e. the grand mean) would the 
df for my denominator be 6-1=5?

So I'd then have:

qf(0.95,3,5)

for my critical F value?
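
To make the arithmetic concrete, here is a minimal sketch in base R that takes the degrees of freedom proposed above at face value. Whether 5 is the right denominator df for this mixed model is exactly the open question, so `dfDen` is an assumption, and `fObs` is a made-up value used only for illustration (it is not a result from the data).

```r
# Degrees of freedom as proposed in the post (assumptions, not settled values)
dfNum <- 3   # 4 treatment groups - 1
dfDen <- 5   # 6 animals - 1 (the value being asked about)

# Critical value of F at the 5% level
critF <- qf(0.95, dfNum, dfDen)

# Hypothetical observed F statistic, for illustration only
fObs <- 7.2

# Upper-tail p-value for the hypothetical statistic
pVal <- pf(fObs, dfNum, dfDen, lower.tail = FALSE)

cat("critical F:", critF, "\n")
cat("p-value for hypothetical F:", pVal, "\n")
```

Comparing `fObs` against `critF` and checking `pVal < 0.05` are two views of the same test; both depend entirely on the choice of `dfDen`.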

Best

iain

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] heatmap tile size question

2013-02-01 Thread Iain Gallagher
Hello List

I was wondering if it is possible to make the individual 'tiles' in a heatmap 
larger. Often when I plot heatmaps and want to label the rows with eg gene 
names I either have to shrink the text or leave it out altogether as it becomes 
so small as to be unreadable. I wondered if there was a way to make the 'tiles' 
deeper and therefore allow more room for the row labels.

I apologise if this is not entirely clear but the toy code below might 
illustrate the problem.

library(hgu133plus2.db) 
# fake up some data
testMat <- matrix(rnorm(4500, 4, 2), 150, 30)
testMat[,15:30] <- testMat[,15:30]*4 # clear difference

geneNames <- as.list(hgu133plus2SYMBOL) # get some gene names
rownames(testMat) <- stack(sample(geneNames, 150))[,1] # relabel the matrix

heatmap(testMat) # row labels overlap
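
One possible approach, sketched here under assumptions rather than as a definitive answer: heatmap() fills whatever device it is drawn on, so a taller device gives each row more room. The placeholder row names, the output file name and the device size below are all made up for illustration.

```r
set.seed(1)

# Fake data as in the toy example, with placeholder labels
# instead of real gene symbols
testMat <- matrix(rnorm(4500, 4, 2), 150, 30)
testMat[, 15:30] <- testMat[, 15:30] * 4
rownames(testMat) <- paste0("gene", seq_len(150))

# Draw on a tall PDF device so each of the 150 rows gets more vertical space
outFile <- file.path(tempdir(), "heatmap_tall.pdf")  # hypothetical file name
pdf(outFile, width = 8, height = 24)
heatmap(testMat, cexRow = 0.6, margins = c(5, 8))     # cexRow sets label size
dev.off()
```

The height of 24 inches is arbitrary; roughly 0.15 inches per row is a reasonable starting point to tune from.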

Best

Iain



[R] write out list of lists with names

2012-11-28 Thread Iain Gallagher


Hello List

I have a list question. I'm doing some data wrangling for a colleague and I 
have a nested list in the following format:

holdList <- structure(list(MU10 = structure(c(0.80527905920989, 0.4350488707836, 
0.455195366623, 0.565174432205497, 0.208180556861924), .Names = c("MU.16", 
"MU.19", "MU.21", "mean", "sd")), MU11 = structure(c(0.56061565798878, 
0.65200918021661, 0.606312419102695, 0.0646249793238221), .Names = c("MU.21", 
"MU.22", "mean", "sd")), MU12 = structure(c(0.77265115449472, 
0.3925776107826, 0.38222435807226, 0.515817707783193, 0.222484520748552
), .Names = c("MU.14", "MU.20", "MU.23", "mean", "sd")), MU13 = 
structure(c(0.36360576458114, 
0.21396125968483, 0.288783512132985, 0.105814644179484), .Names = c("MU.20", 
"MU.22", "mean", "sd")), MU14 = structure(c(0.31692017862428, 
0.31692017862428, NA), .Names = c("MU.18", "mean", "sd")), MU15 = 
structure(c(0.57645424339545, 
0.82369227173036, 0.700073257562905, 0.174823686402807), .Names = c("MU.18", 
"MU.22", "mean", "sd"))), .Names = c("MU10", "MU11", "MU12", 
"MU13", "MU14", "MU15"))

I would like to write this to a text file in the form e.g. (each x is a value):

MU10
MU.16  MU.19  MU.21  mean  sd
x  x  x  x  x  
MU11
MU.21  MU.22 mean sd
x  x  x  x

Where each list element is on a new block of three rows.

After consulting Google I came across the following:

fnlist <- function(x, fil){ 
z <- deparse(substitute(x))
cat(z, "\n", file = fil)
nams <- names(x)

for (i in seq_along(x) ){ cat(nams[i], "\n", x[[i]], "\n", file = fil, append = 
TRUE) }
}

fnlist(holdList, 'res.txt')

However this doesn't print the names within each sub list.

Can anyone advise?
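
A minimal sketch of one way to get the inner names out as well, assuming tab-separated fields are acceptable. The helper name `writeNamedList` and the small `demoList` are made up for illustration and are not part of the code found above.

```r
# Hypothetical helper: write each element of a named list of named vectors
# as a block of three lines: element name, value names, values.
writeNamedList <- function(x, fil) {
  con <- file(fil, open = "w")
  on.exit(close(con))
  for (nm in names(x)) {
    writeLines(nm, con)
    writeLines(paste(names(x[[nm]]), collapse = "\t"), con)
    writeLines(paste(x[[nm]], collapse = "\t"), con)
  }
  invisible(fil)
}

# Small stand-in for the real nested list
demoList <- list(MU10 = c(MU.16 = 0.8, mean = 0.8, sd = NA),
                 MU11 = c(MU.21 = 0.5, MU.22 = 0.7))

outFile <- tempfile(fileext = ".txt")
writeNamedList(demoList, outFile)
written <- readLines(outFile)
print(written)
```

paste() converts the numbers with up to 15 significant digits; wrapping the values in format() or round() first would give tidier columns.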

Thanks

Iain


R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8   LC_NUMERIC=C  
 [3] LC_TIME=en_GB.UTF-8    LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=C LC_NAME=C 
 [9] LC_ADDRESS=C   LC_TELEPHONE=C    
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] plyr_1.7.1    gsubfn_0.6-5  proto_0.3-9.2

loaded via a namespace (and not attached):
[1] tcltk_2.15.1 tools_2.15.1







[R] conditional subset and reorder dataframe rows

2012-07-20 Thread Iain Gallagher
Hi List

I have a dataframe (~1,200,000 rows deep) and I'd like to conditionally reorder 
groups of rows in this dataframe. 

I would like to reorder any rows where the Chr.Strand column contains a '-' but 
reorder within subsets delineated by the Probe.Set.Name column.

# toy example 

library(plyr)

negStrandGene <- data.frame(Probe.Set.Name = rep('ENSMUSG0022174_at', 6), 
Chr = rep(14,6), Chr.Strand = rep('-', 6), Chr.From = c(54873546, 54873539, 
54873533, 54873529, 54873527, 54873416), Probe.X = 
c(388,1634,2141,2305,882,960), Probe.Y = c(2112, 1773, 1045, 862, 971, 2160))

posStrandGene <- data.frame(Probe.Set.Name = rep('ENSMUSG0047459_at', 6), 
Chr = rep(2, 6), Chr.Strand = rep('+', 6), Chr.From = c(155062277, 155062304, 
155062305, 155062309, 155062326, 155062531), Probe.X = c(428, 1681, 2058, 1570, 
1293, 2125), Probe.Y = c(1484, 2090, 893, 1082, 1435, 1008))

mapping <- rbind(negStrandGene, posStrandGene)

# define a function to do what we want
revSort <- function(df){

  if (unique(df$Chr.Strand == '-')) 
  return (df[order(df$Chr.From), ])
  else return (df)

}


# split the data with plyr, apply the function and recombine
test2 <- ddply(mapping, .(Probe.Set.Name), function(df) revSort(df)) # ok, cool, works

So here the rows with '-' in Chr.Strand are reordered whilst those with '+' 
are not.

My initial attempt using plyr is very inefficient and I wondered if someone 
could suggest something better.
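
One possible alternative, sketched in base R: a single call to order() over the whole data frame instead of a function call per probe set. It assumes every probe set sits on exactly one strand, and (like ddply) it returns the probe sets sorted by name. The small `mappingDemo` data frame and the names `sortKey` and `reordered` are made up for illustration.

```r
# Toy data: one negative-strand and one positive-strand probe set
mappingDemo <- data.frame(
  Probe.Set.Name = rep(c("geneNeg_at", "genePos_at"), each = 3),
  Chr.Strand     = rep(c("-", "+"), each = 3),
  Chr.From       = c(300, 200, 100, 50, 70, 60),
  Probe.X        = c(1, 2, 3, 4, 5, 6)
)

neg <- mappingDemo$Chr.Strand == "-"

# Secondary sort key: genomic position for '-' rows,
# original row number for '+' rows (so their order is left alone)
sortKey <- ifelse(neg, mappingDemo$Chr.From, seq_len(nrow(mappingDemo)))

# One vectorised ordering: by probe set, then by the key above
reordered <- mappingDemo[order(mappingDemo$Probe.Set.Name, sortKey), ]
print(reordered)
```

Because order() is vectorised, this avoids splitting 1,200,000 rows into many small data frames, which is usually where the plyr version spends its time.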

Best

Iain

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reading file in zip archive

2012-05-31 Thread Iain Gallagher
Hi Phil

That's it. Thanks.

Will have a read at the docs now and see if I can figure out why leaving the 
'r'ead instruction out works. Seems counter-intuitive!
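
For reference, the pattern that worked can be sketched as below. It does not explain why the 'r' argument causes the error; it only shows the connection being passed to read.table unopened. The temporary directory and file names are made up for illustration, and the zip step is skipped if no external zip program is found.

```r
# Write a small tab-delimited file into a scratch directory
x <- data.frame(V1 = rep("a", 5), V2 = rep("b", 5))
tmpDir <- tempfile("zipdemo")
dir.create(tmpDir)
txtPath <- file.path(tmpDir, "x.txt")
write.table(x, txtPath, sep = "\t", quote = FALSE)

zipPath <- file.path(tmpDir, "test.zip")
haveZip <- nzchar(Sys.which("zip"))   # utils::zip needs an external zip program

if (haveZip) {
  # -j stores the file without its directory path, -q keeps zip quiet
  zip(zipPath, files = txtPath, flags = "-jq")

  # No open mode given: read.table opens and closes the connection itself
  z <- unz(zipPath, "x.txt")
  zT <- read.table(z, header = TRUE, sep = "\t")
  print(zT)
}
```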

Best

Iain




 From: Phil Spector spec...@stat.berkeley.edu
To: Iain Gallagher iaingallag...@btopenworld.com 
Cc: r-help r-help@r-project.org 
Sent: Thursday, 31 May 2012, 0:06
Subject: Re: [R] reading file in zip archive

Iain -
   Do you see the same behaviour if you use

z <- unz(pathToZip, 'x.txt')

instead of

z <- unz(pathToZip, 'x.txt', 'r')

                    - Phil Spector
                     Statistical Computing Facility
                     Department of Statistics
                     UC Berkeley
                    spec...@stat.berkeley.edu




[R] reading file in zip archive

2012-05-30 Thread Iain Gallagher
Hi List

I have a series of zip archives each containing several files. One of these 
files is called goCats.txt and I would like to read it into R from the archive. 
It's a simple tab delimited text file.

pathToZip <- 
'/home/iain/Documents/Work/Results/bovineMacRNAData/deAnalysis/afInfection/commonNorm/twoHrs/af2hrs.zip'

z <- unz(pathToZip, 'goCats.txt', 'r')
zT <- read.table(z, 'goCats.txt', header=T, sep='\t')

Error in read.table(z, "goCats.txt", header = T, sep = "\t") : 
  seek not enabled for this connection


The same error arises with readLines.

Can anyone advise?

Best

iain

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8   LC_NUMERIC=C 
 [3] LC_TIME=en_GB.utf8    LC_COLLATE=en_GB.utf8    
 [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8   
 [7] LC_PAPER=C    LC_NAME=C    
 [9] LC_ADDRESS=C  LC_TELEPHONE=C   
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C  

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

loaded via a namespace (and not attached):
[1] tools_2.15.0


Re: [R] reading file in zip archive

2012-05-30 Thread Iain Gallagher
Hi Phil

Thanks, but this still doesn't work. 


Here's a reproducible example (was wrapping my head around these functions 
before).

x <- as.data.frame(cbind(rep('a',5), rep('b',5)))
y <- as.data.frame(cbind(rep('c',5), rep('d',5)))

write.table(x, 'x.txt', sep='\t', quote=FALSE)
write.table(y, 'y.txt', sep='\t', quote=FALSE)

zip('test.zip', files = c('x.txt', 'y.txt'))

pathToZip <- paste(getwd(), '/test.zip', sep='')

z <- unz(pathToZip, 'x.txt', 'r')
zT <- read.table(z, header=FALSE, sep='\t')

Error in read.table(z, header = FALSE, sep = "\t") : 
  seek not enabled for this connection


As I said in my previous email readLines fails as well. Rather strange really.

Anyway, as before any advice would be appreciated.

Best

Iain





 From: Phil Spector spec...@stat.berkeley.edu
To: Iain Gallagher iaingallag...@btopenworld.com 
Cc: r-help r-help@r-project.org 
Sent: Wednesday, 30 May 2012, 20:16
Subject: Re: [R] reading file in zip archive

Iain -
    Once you specify the file to unzip in the call to unz, there's no
need to repeat the filename in read.table.  Try:

z <- unz(pathToZip, 'goCats.txt', 'r')
zT <- read.table(z, header=TRUE, sep='\t')

(Although I can't reproduce the exact error which you saw.)

                    - Phil Spector
                     Statistical Computing Facility
                     Department of Statistics
                     UC Berkeley
                    spec...@stat.berkeley.edu





Re: [R] Completely Off Topic:Link to IOM report on use of -omics tests in clinical trials

2012-03-26 Thread Iain Gallagher
I followed this case while it was ongoing. 


It was a very interesting example of basic mistakes but also (for me) of 
journal politicking. 


Keith Baggerly and Kevin Coombes wrote a great paper - DERIVING 
CHEMOSENSITIVITY FROM CELL LINES: FORENSIC BIOINFORMATICS AND REPRODUCIBLE 
RESEARCH IN HIGH-THROUGHPUT BIOLOGY in The Annals of Applied Statistics (2009, 
Vol. 3, No. 4, 1309–1334) which explains some of the background and 
investigative work they had to do to bring those mistakes to light.


Best

iain



- Original Message -
From: Bert Gunter gunter.ber...@gene.com
To: r-help@r-project.org
Cc: 
Sent: Monday, 26 March 2012, 19:12
Subject: [R] Completely Off Topic:Link to IOM report on use of -omics tests 
in clinical trials

Warning: This has little directly to do with R, although R and related
tools (e.g. sweave and other reproducible research tools) have a
natural role to play.

The IOM report:

http://www.iom.edu/Reports/2012/Evolution-of-Translational-Omics.aspx

that arose out of the Duke Univ. genomics testing scandal has been
released. My thanks to Keith Baggerly for forwarding this. I believe
that many R users in the medical research community will find this
interesting, and I hope I do not venture too far out of line by
passing on the link to readers of this list. It **will** have an
important impact on so-called Personalized Health Care (which I guess
affects all of us), and open source analytical (statistical)
methodology is a central issue.

For those interested, try the summary first.

Best to all,
Bert


-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] hgu133plus2hsentrezgprobe library

2012-03-19 Thread Iain Gallagher
Hi Eleni

Questions like this are better served on the Bioconductor mailing list.

Nonetheless try this

ALL <- topTable(fit2, coef=1, number=Inf)
ALL$SYMBOL <- unlist(mget(ALL$ID, hgu133plus2hsentrezgSYMBOL, ifnotfound=NA))

Here ALL is the output from limma for differential expression (ALL$ID is the 
probe on ENTREZ centric cdf from brainarray).

Best

Iain



- Original Message -
From: Eleni Christodoulou elenic...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Monday, 19 March 2012, 18:47
Subject: [R] hgu133plus2hsentrezgprobe library

Hello R community,

I am processing raw Affymetrix CEL files and I am using the Michigan custom
CDF library hgu133plus2hsentrezgprobe. I have been looking for
documentation on the function that it contains...I am specifically
interested in converting probe names to gene symbols. Does anybody know
where I can find it?

Thanks a lot!
Eleni



[R] ggplot2 reorder factors for faceting

2011-11-08 Thread Iain Gallagher


Dear List

I am trying to draw a heatmap using ggplot2. In this heatmap I have faceted my 
data by 'infection' of which I have four. These four infections break down into 
two types and I would like to reorder the 'infection' column of my data to 
reflect this. 

Toy example below:

library(ggplot2)

# test data for ggplot reordering
genes <- (rep (c(rep('a',4), rep('b',4), rep('c',4), rep('d',4), rep('e',4), 
rep('f',4)) ,4))
fcData <- rnorm(96)
times <- rep(rep(c(2,6,24,48),6),4)
infection <- c(rep('InfA', 24), rep('InfB', 24), rep('InfC', 24), rep('InfD', 
24))
infType <- c(rep('M', 24), rep('D',24), rep('M', 24), rep('D', 24))

# data is long format for ggplot2
plotData <- as.data.frame(cbind(genes, as.numeric(fcData), as.numeric(times), 
infection, infType))

hp2 <- ggplot(plotData, aes(factor(times), genes)) + geom_tile(aes(fill = 
scale(as.numeric(fcData)))) + facet_wrap(~infection, ncol=4)

# set scale
hp2 <- hp2 + scale_fill_gradient2(name=NULL, low="#0571B0", mid="#F7F7F7", 
high="#CA0020", midpoint=0, breaks=NULL, labels=NULL, limits=NULL, 
trans="identity") 

# set up text (size, colour etc etc)
hp2 <- hp2 + labs(x = "Time", y = "") + scale_y_discrete(expand = c(0, 0)) + 
opts(axis.ticks = theme_blank(), axis.text.x = theme_text(size = 10, angle = 
360, hjust = 0, colour = "grey25"), axis.text.y = theme_text(size=10, colour = 
'gray25'))

hp2 <- hp2 + theme_bw()

In the resulting plot I would like infections InfA and InfC plotted next to 
each other and likewise for InfB and InfD. I have a column in the data - 
infType - which I could use to reorder the infection column but so far I have 
had no luck getting this to work.

Could someone give me a pointer to the best way to reorder the infection factor 
and accompanying data into the order I would like?
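
A minimal sketch of one way to do this, assuming that facet_wrap lays panels out in the order of the factor levels (which is how faceting on a factor normally behaves). Only base R is used here; the cut-down `facetData` data frame and the `levByType` name are made up for illustration.

```r
# Cut-down version of the toy data: just the two columns that matter
infection <- c(rep('InfA', 24), rep('InfB', 24), rep('InfC', 24), rep('InfD', 24))
infType   <- c(rep('M', 24), rep('D', 24), rep('M', 24), rep('D', 24))
facetData <- data.frame(infection, infType, stringsAsFactors = FALSE)

# Option 1: state the desired panel order explicitly
facetData$infection <- factor(facetData$infection,
                              levels = c('InfA', 'InfC', 'InfB', 'InfD'))

# Option 2: derive an order from infType (type first, then infection name)
ord <- order(facetData$infType, as.character(facetData$infection))
levByType <- unique(as.character(facetData$infection)[ord])
print(levels(facetData$infection))
print(levByType)
```

Option 2 groups the 'D' infections before the 'M' ones because of alphabetical ordering of infType; either way the data rows themselves do not need to be moved, only the factor levels.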

Best

iain

> sessionInfo()
R version 2.13.2 (2011-09-30)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8   LC_NUMERIC=C 
 [3] LC_TIME=en_GB.utf8    LC_COLLATE=en_GB.utf8    
 [5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8   
 [7] LC_PAPER=en_GB.utf8   LC_NAME=C    
 [9] LC_ADDRESS=C  LC_TELEPHONE=C   
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C  

attached base packages:
[1] grid  stats graphics  grDevices utils datasets  methods  
[8] base 

other attached packages:
[1] ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4 plyr_1.6 

loaded via a namespace (and not attached):
[1] digest_0.5.0 tools_2.13.2




Re: [R] R-Studio Question

2011-08-30 Thread Iain Gallagher
Above the graph on the left there are back and forward arrows. You can use 
those (like a browser).

This might have been better asked on the R-Studio forums. They're very friendly.

best

iain

--- On Tue, 30/8/11, Eran Eidinger e...@taykey.com wrote:

 From: Eran Eidinger e...@taykey.com
 Subject: [R] R-Studio Question
 To: r-help@r-project.org
 Date: Tuesday, 30 August, 2011, 8:59
 Hello,
 
 I've switched to R studio from the StatET Eclipse plug-in.
 I have a question regarding navigating between plots.
 When I use x11() or windows() new devices are created and I know how to
 switch back and forth between them.
 
 However, when I plot on the device that stands for R-Studio's built-in plot
 browser, is there a way to switch back between plots? Each new plot
 command opens a new plot, and the number of devices does not change.
 Thanks,
 Eran.
 


Re: [R] Synchronizing R libraries on N machines?

2011-08-26 Thread Iain Gallagher
Hi Giovanni

Using Ubuntu and MacOSX may not be irrelevant. I use Ubuntu and if I carry out 
a fresh install (e.g. after a new release - although I've stuck with 10.04 so 
far) then I always have to mess around, check the web etc to install external 
packages that R libraries I want to use rely on. 

A good example would be libxml2-dev (which doesn't appear if you use Synaptic 
and search for xml2 - sigh) for using biomaRt etc etc.

Just a heads up that the external software some R libraries rely on might not 
be installed on both systems. So installing a package on one system successfully 
doesn't mean you can always expect it to install without incident on another.
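
The comparison step discussed in this thread can be sketched as follows. This is only an illustration: the file location is made up, both "machines" are simulated in one session, and the install step is left commented out because, as noted above, system libraries may still be missing on the second machine.

```r
# On machine A: record the installed package names in a shared file
pkgFile <- file.path(tempdir(), "packages_machineA.txt")  # hypothetical location
writeLines(rownames(installed.packages()), pkgFile)

# On machine B: read that list and find what is missing locally
wanted    <- readLines(pkgFile)
have      <- rownames(installed.packages())
toInstall <- setdiff(wanted, have)

# install.packages(toInstall)  # uncomment to install the missing packages

# The same idea on toy data: elements of the first set not in the second
toyDiff <- setdiff(letters[1:10], letters[5:20])
print(toyDiff)
```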

Best

iain

--- On Fri, 26/8/11, Giovanni Petris gpet...@uark.edu wrote:

 From: Giovanni Petris gpet...@uark.edu
 Subject: Re: [R] Synchronizing R libraries on N machines?
 To: Rainer M Krug r.m.k...@gmail.com
 Cc: r-help@r-project.org
 Date: Friday, 26 August, 2011, 14:05
 Hi Rainer,
 
 This certainly helps, but it still requires doing some work by hand. I
 was hoping for something more automatic - but so far nobody has
 suggested a better approach.
 
 Thank you,
 Giovanni
 
 
 On Thu, 2011-08-25 at 15:43 +0200, Rainer M Krug wrote:
  
  On Thu, Aug 25, 2011 at 3:25 PM, Giovanni Petris gpet...@uark.edu wrote:
  
          Hello!
          
          I am using R on two different machines (under Ubuntu and OS X,
          but this is probably irrelevant) and I would like to keep the
          two installations 'synchronized', in particular in terms of
          installed packages. For example, if I install package xxx on my
          Linux machine, I would like to find it installed also on my Mac,
          and vice versa.
          
          I imagine this to be a fairly common problem, so I would like
          to ask if anybody has suggestions to share about it. Is there a
          way to make the synchronization automatic? Painless?
  
  library()$result[,1] returns the names of the installed packages. If
  you do this on one machine, then compare it with the same output on
  the other machine, you can identify the packages which are not
  installed, and you can install those.
  
  e.g.:
  x <- letters[1:10]
  y <- letters[5:20]
  x[!(x %in% y)]
  
  returns
  "a" "b" "c" "d"
  
  which are in x, but not y.
  
  Hope this helps,
  
  Rainer
  
          Thank you in advance for the suggestions.
          
          Best,
          Giovanni
          
          --
          
          Giovanni Petris  gpet...@uark.edu
          Associate Professor
          Department of Mathematical Sciences
          University of Arkansas - Fayetteville, AR 72701
          Ph: (479) 575-6324, 575-8630 (fax)
          http://definetti.uark.edu/~gpetris/
  
  
  
  
  -- 
  Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
  Biology, UCT), Dipl. Phys. (Germany)
  
  Centre of Excellence for Invasion Biology
  Stellenbosch University
  South Africa
  
  Tel :       +33 - (0)9 53 10 27 44
  Cell:       +33 - (0)6 85 62 59 98
  Fax (F):    +33 - (0)9 58 10 27 44
  
  Fax (D):    +49 - (0)3 21 21 25 22 44
  
  email:      rai...@krugs.de
  
  Skype:      RMkrug
 
 


Re: [R] aggregating data

2011-06-30 Thread Iain Gallagher
Hi Max

Using plyr instead of reshape:

library(plyr)

df <- data.frame(gene=c('A', 'A', 'E', 'A', 'F', 'F'), probe = c(1,2,3,4,5,6))

ddply(df, .(gene), function(df)length(df$gene))

  gene V1
1    A  3
2    E  1
3    F  2

best

iain

--- On Thu, 30/6/11, Max Mariasegaram max.mariasega...@qut.edu.au wrote:

 From: Max Mariasegaram max.mariasega...@qut.edu.au
 Subject: [R] aggregating data
 To: r-help@r-project.org r-help@r-project.org
 Date: Thursday, 30 June, 2011, 8:28
 Hi,
 
 I am interested in using the cast function in R to perform some
 aggregation. I did once manage to get it working, but have now forgotten
 how I did this. So here is my dilemma. I have several thousands of probes
 (about 180,000) corresponding to each gene; what I'd like to do is obtain
 a frequency count of the various occurrences of each probe for each gene.
 
 The data would look something like this:
 
 Gene   ProbeID   Expression_Level
 A      1         0.34
 A      2         0.21
 E      3         0.11
 A      4         0.21
 F      5         0.56
 F      6         0.87
 .
 .
 .
 (18 data points)
 
 In each case, the probeID is unique. The output I am looking for is
 something like this:
 
 Gene   No.ofprobes   Mean_expression
 A      3             0.25
 
 Is there an easy way to do this using cast or melt? Ideally, I would also
 like to see the unique probes corresponding to each gene in the wide
 format.
 
 Thanks in advance
 Max
 
 Maxy Mariasegaram | Research Fellow | Australian Prostate Cancer Research
 Centre | Level 1, Building 33 | Princess Alexandra Hospital | 199 Ipswich
 Road, Brisbane QLD 4102 Australia | t: 07 3176 3073 | f: 07 3176 7440 |
 e: maria...@qut.edu.au
 
 


Re: [R] aggregating data

2011-06-30 Thread Iain Gallagher
oops last reply was only half the solution:

library(plyr)
df <- data.frame(gene=c('A', 'A', 'E', 'A', 'F', 'F'), probe = c(1,2,3,4,5,6), 
exp = c(0.34, 0.21, 0.11, 0.21, 0.56, 0.81))

ddply(df, .(gene), function(df)c(length(df$gene), median(df$exp)))

  gene V1    V2
1    A  3 0.210
2    E  1 0.110
3    F  2 0.685
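
For comparison, a sketch of the same summary in base R, using the mean (as the original question asked) rather than the median. It uses the six example rows from the question; the names `probeData`, `n_probes` and `mean_exp` are made up for illustration.

```r
# The six example rows from the question
probeData <- data.frame(gene  = c('A', 'A', 'E', 'A', 'F', 'F'),
                        probe = 1:6,
                        exp   = c(0.34, 0.21, 0.11, 0.21, 0.56, 0.87))

# Number of probes per gene
counts <- aggregate(probe ~ gene, data = probeData, FUN = length)

# Mean expression per gene
means <- aggregate(exp ~ gene, data = probeData, FUN = mean)

# Combine the two summaries into one table, one row per gene
res <- merge(counts, means, by = "gene")
names(res) <- c("gene", "n_probes", "mean_exp")
print(res)
```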

best

iain

--- On Thu, 30/6/11, Max Mariasegaram max.mariasega...@qut.edu.au wrote:

 From: Max Mariasegaram max.mariasega...@qut.edu.au
 Subject: [R] aggregating data
 To: r-help@r-project.org r-help@r-project.org
 Date: Thursday, 30 June, 2011, 8:28
 Hi,
 
 I am interested in using the cast function in R to perform
 some aggregation. I did once manage to get it working, but
 have now forgotten how I did this. So here is my dilemma. I
 have several thousands of probes (about 180,000)
 corresponding to each gene; what I'd like to do is obtain is
 a frequency count of the various occurrences of each probes
 for each gene.
 
 The data would look something like this:
 
 Gene     ProbeID     
          Expression_Level
 A         
    1           
   0.34
 A         
    2           
   0.21
 E              3 
             0.11
 A         
    4           
   0.21
 F              5 
             0.56
 F              6 
             0.87
 .
 .
 .
 (18 data points)
 
 In each case, the probeID is unique. The output I am
 looking for is something like this:
 
 Gene     No.ofprobes     Mean_expression
 A        3               0.25
 
 Is there an easy way to do this using cast or melt?
 Ideally, I would also like to see the unique probes
 corresponding to each gene in the wide format.
 
 Thanks in advance
 Max
 
 Maxy Mariasegaram| Reserach Fellow | Australian Prostate
 Cancer Research Centre| Level 1, Building 33 | Princess
 Alexandra Hospital | 199 Ipswich Road, Brisbane QLD 4102
 Australia | t: 07 3176 3073| f: 07 3176 7440 | e: maria...@qut.edu.au
 
 


[R] median time period

2011-06-29 Thread Iain Gallagher
Hello List

I'm trying to calculate the median period (in months) of a set of time 
intervals (between two interventions). 

I have been playing with the lubridate package to create the intervals but I 
can't think of the right approach to get the median timeperiod.

Toy code:

library(lubridate)
test <- c('08-04-22', '08-07-28', '09-03-02', '09-03-03', '09-01-30',
          '09-03-09', '10-02-24', '10-03-05')
test <- ymd(test)
intervals <- as.period(test[5:8] - test[1:4])

> intervals
[1] 9 months and 8 days    7 months and 9 days    11 months and 22 days
[4] 1 year and 2 days

How can I convert this 'period' object to months? From there I think I should 
just convert to 'numeric' and calculate the median.

Garrett if you're out there - great package but could you help please!?

Best

iain

> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
 [7] LC_PAPER=en_GB.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] lubridate_0.2.4

loaded via a namespace (and not attached):
[1] plyr_1.5.2  stringr_0.4




Re: [R] median time period

2011-06-29 Thread Iain Gallagher
Typical - you post to the list and then work it out for yourself!

Anyway here's my solution

Toy code as before then:

# convert whole years to months, then add the remaining months for each entry
# in intervals (leftover days are ignored)
intervalsMonths <- 12 * intervals$year + intervals$month

medianMonths <- median(as.numeric(intervalsMonths))
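
For comparison, here is a rough base R sketch of the same idea that does not depend on how a particular lubridate version stores periods. The helper whole_months() is hypothetical (my own name), and I have written the toy dates out with four-digit years, which is an assumption about how the two-digit years above should be read.

```r
# Start and end dates of the four intervals from the toy example
start <- as.Date(c("2008-04-22", "2008-07-28", "2009-03-02", "2009-03-03"))
end   <- as.Date(c("2009-01-30", "2009-03-09", "2010-02-24", "2010-03-05"))

# Hypothetical helper: completed calendar months between two dates
whole_months <- function(from, to) {
  a <- as.POSIXlt(from)
  b <- as.POSIXlt(to)
  m <- 12 * (b$year - a$year) + (b$mon - a$mon)
  # drop one month if the day of month has not yet been reached
  m - (b$mday < a$mday)
}

months_between <- whole_months(start, end)
median_months  <- median(months_between)
print(months_between)
print(median_months)
```

Like the period-based version, this counts whole months only and ignores the leftover days.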

Best

iain



Re: [R] Why doesn't this work ?

2011-03-17 Thread Iain Gallagher
The first line of this reply is a definite candidate for the fortunes package!

best

i

--- On Thu, 17/3/11, bill.venab...@csiro.au bill.venab...@csiro.au wrote:

 From: bill.venab...@csiro.au bill.venab...@csiro.au
 Subject: Re: [R] Why doesn't this work ?
 To: ericst...@aol.com, r-help@r-project.org
 Date: Thursday, 17 March, 2011, 3:54
 It doesn't work (in R) because it is
 not written in R.  It's written in some other language
 that looks a bit like R.
 
 > t <- 3
 > z <- t %in% 1:3
 > z
 [1] TRUE
 > t <- 4
 > z <- t %in% 1:3
 > z
 [1] FALSE
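
A small sketch of why the original expression always comes out true, and how %in% behaves instead. The variable names (t_val, always_true, and so on) are mine, chosen to avoid masking the base function t().

```r
t_val <- 4

# `t_val == 1 || 2 || 3` is parsed as (t_val == 1) || 2 || 3.
# The bare numbers 2 and 3 are coerced to TRUE, so the result is TRUE
# whatever t_val is.
always_true <- (t_val == 1) || 2 || 3

# %in% tests membership in the set 1:3, which is what was intended
is_member <- t_val %in% 1:3

# %in% is vectorised, so it also works inside ifelse() on a whole vector
flags <- ifelse(c(1, 4, 3) %in% 1:3, 1, 0)

print(always_true)
print(is_member)
print(flags)
```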
  
 
  
 
 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org]
 On Behalf Of eric
 Sent: Thursday, 17 March 2011 1:26 PM
 To: r-help@r-project.org
 Subject: [R] Why doesn't this work ?
 
 Why doesn't this work and is there a better way ?
 
 z <- ifelse(t==1 || 2 || 3, 1,0)
 t <- 3
 z
 [1] 1
 t <- 4
 z
 [1] 1
 
 trying to say ...if t == 1 or if t== 2 or if t ==3 then
 true, otherwise
 false
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Why-doesn-t-this-work-tp3383656p3383656.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] Export R dataframes to excel

2011-03-01 Thread Iain Gallagher
This appeared today on the r-bloggers site and might be useful for you.

http://www.r-bloggers.com/release-of-xlconnect-0-1-3/

cheers

i

--- On Tue, 1/3/11, Steve Taylor steve.tay...@aut.ac.nz wrote:

 From: Steve Taylor steve.tay...@aut.ac.nz
 Subject: Re: [R] Export R dataframes to excel
 To: r-help@r-project.org, maxsilva mmsil...@uc.cl
 Date: Tuesday, 1 March, 2011, 20:15
 You can copy it with the following
 function and then paste into Excel...
  
 copy = function (df, buffer.kb=256) {
   write.table(df, file=paste("clipboard-", buffer.kb, sep=""),
               sep="\t", na='', quote=FALSE, row.names=FALSE)
 }
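
As far as I know the "clipboard-" target is specific to Windows, so here is a minimal sketch of a more portable route: write a CSV file that Excel can open. The data frame and the file name are made up for illustration.

```r
# Made-up example data
df <- data.frame(gene     = c("A", "E", "F"),
                 n_probes = c(3L, 1L, 2L),
                 stringsAsFactors = FALSE)

# Write to a temporary location; in practice choose your own path
out_file <- file.path(tempdir(), "export_for_excel.csv")
write.csv(df, file = out_file, row.names = FALSE, na = "")

# Read it back to check that the round trip preserved the data
back <- read.csv(out_file, stringsAsFactors = FALSE)
print(back)
```

This needs no extra packages or external tools such as Perl or Python, at the cost of losing Excel-specific formatting.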
 
 
  
 
 From: maxsilva mmsil...@uc.cl
 To:r-help@r-project.org
 Date: 2/Mar/2011 8:50a
 Subject: [R] Export R dataframes to excel
 
 I'm trying to do this in several ways but haven't had any
 result. I'm asked to
 install Python, or Perl etc. Can anybody suggest a
 direct, easy and
 understandable way?  Any help would be appreciated.
 
 
 Thx.
 
 -- 
 View this message in context: 
 http://r.789695.n4.nabble.com/Export-R-dataframes-to-excel-tp3330399p3330399.html
 
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] multiple plots using a loop

2011-02-21 Thread Iain Gallagher
Hi Darcy

This works for me:

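Factor <- rep(factor(letters[1:4]), each = 10)
Size <- runif(40) * 100

par(mfrow = c(2, 2))

for (i in unique(Factor)) {
    hist(Size[Factor == i], main = i,
         xlab = paste("n =", length(Size[Factor == i])), ylab = "")
}
 
I think that for (i in Factor) steps through every element of Factor (all 40 
values) rather than each level once, so 40 histograms are drawn and the 2 x 2 
layout ends up showing only the last four, which all belong to the last level. 
Looping over unique(Factor) gives one plot per level, and paste() builds the 
label as a single string so it appears on one line.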
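
The difference can be checked without plotting. The sketch below, which fixes a seed only so that it is repeatable, counts how many iterations each form of the loop would make and then draws one histogram per level from a split() of the data. Names such as by_level and labels are my own.

```r
set.seed(1)  # only so the toy data are repeatable
Factor <- rep(factor(letters[1:4]), each = 10)
Size   <- runif(40) * 100

n_elements <- length(Factor)   # iterations made by for (i in Factor)
n_levels   <- nlevels(Factor)  # iterations made by looping over the levels

# One vector of sizes per level, plus a one-line label for each
by_level <- split(Size, Factor)
labels   <- vapply(by_level, function(x) paste("n =", length(x)), character(1))

op <- par(mfrow = c(2, 2))
for (lev in names(by_level)) {
  hist(by_level[[lev]], main = lev, xlab = labels[[lev]], ylab = "")
}
par(op)
```

With 20 levels the same loop should work with par(mfrow = c(4, 5)).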

cheers

iain

--- On Mon, 21/2/11, Darcy Webber darcy.web...@gmail.com wrote:

 From: Darcy Webber darcy.web...@gmail.com
 Subject: [R] multiple plots using a loop
 To: r-help@r-project.org
 Date: Monday, 21 February, 2011, 9:25
 Dear R users,
 
 I am trying to write myself a loop in order to produce a
 set of 20
 length frequency plots each pertaining to a factor level. I
 would like
 each of these plots to be available on the same figure, so
 I have used
 par(mfrow = c(4, 5)). However, when I run my loop below, it
 produces
 20 plots for each factor level and only displays the last
 factor
 levels LF plots. I'm fairly new to loops in R, so any help
 would be
 greatly appreciated.
 
 I have provided an example data set below if required with
 just 4
 factors and adjusted par settings accordingly.
 
 Factor <- rep(factor(letters[1:4]), each = 10)
 Size <- runif(40) * 100
 
 par(mfrow = c(2, 2))
 
 for (i in Factor) {
 LFchart <- hist(Size[Factor == i], main = i,
 xlab = c("n =",length(Size[Factor == i])), ylab = "")
 }
 
 P.S. Also just a quick annoying question. My xlab
 displays:
 n =
 120
 I would like it to display:
 n = 120
 but just cant get it to work. Any thoughts.
 
 Regar
 


Re: [R] help with function

2010-12-18 Thread Iain Gallagher
Ok - used browser to step through the function. Thanks for the nod towards that 
Chuck.

> tf2 <- cumulMetric(tf1, deMirs$up)
Called from: cumulMetric(tf1, deMirs$up)
Browse[1]> fcVector <- as.numeric(with(deMirs, FC[match(deMirPresGenes[,4], Probe)]))

Browse[1]> metric <- fcVector * as.numeric(deMirPresGenes[,11])

Browse[1]> geneMetric <- cbind(deMirPresGenes[,2], metric)

Browse[1]> ls()
[1] "deMirPresGenes" "deMirs"         "fcVector"       "geneMetric"
[5] "metric"

Browse[1]> listMetric <- unstack(geneMetric, as.numeric(geneMetric[,2])~geneMetric[,1])
Error in eval(expr, envir, enclos) : object 'geneMetric' not found

Browse[1]> ls()
[1] "deMirPresGenes" "deMirs"         "fcVector"       "geneMetric"
[5] "metric"

Browse[1]> head(geneMetric)
     sym    metric
[1,] AAK1   -0.35505
[2,] ABCA1  -0.34979
[3,] ABCA2  -1.0329
[4,] ABCB10 -1.22558
[5,] ABCE1  -0.61348
[6,] ABCF3  -0.86584


So geneMetric is there. It looks right but for some reason the call to unstack 
cannot find it. Yet if I go through this line by line but not as a function the 
call to unstack works fine:

> fcVector <- as.numeric(with(deMirs$up, FC[match(tf1[,4], Probe)]))
> metric <- fcVector * as.numeric(tf1[,11])
> geneMetric <- cbind(tf1[,2], metric)
> head(geneMetric)
            metric
[1,] AAK1   -0.35505
[2,] ABCA1  -0.34979
[3,] ABCA2  -1.0329
[4,] ABCB10 -1.22558
[5,] ABCE1  -0.61348
[6,] ABCF3  -0.86584

> colnames(geneMetric) <- c('sym', 'metric')

> listMetric <- unstack(geneMetric, as.numeric(geneMetric[,2])~geneMetric[,1])
> head(listMetric)
$AAK1
[1] -0.35505

$ABCA1
[1] -0.34979

$ABCA2
[1] -1.0329

$ABCB10
[1] -1.22558

$ABCE1
[1] -0.61348

$ABCF3
[1] -0.86584
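
One possible explanation, and this is an assumption rather than something confirmed in this thread, is that the formula given to unstack() is evaluated somewhere that cannot see variables local to the calling function, which would fit the fact that it works at the top level. A sketch that sidesteps formula evaluation altogether uses split() on the two columns. The function name sumByGene and the third toy row are made up for illustration.

```r
# Hypothetical helper: sum the metric column within each gene symbol
sumByGene <- function(geneMetric) {
  # geneMetric is a character matrix with columns 'sym' and 'metric'
  groups <- split(as.numeric(geneMetric[, "metric"]), geneMetric[, "sym"])
  vapply(groups, sum, numeric(1))
}

# Toy character matrix; the second AAK1 row is invented so that the
# summing step has something to add up
m <- cbind(sym    = c("AAK1", "ABCA1", "AAK1"),
           metric = c(-0.35505, -0.34979, -0.1))

res <- sumByGene(m)
print(res)
```

Because split() is handed the vectors directly, nothing has to be looked up by name inside another function's environment.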

Any further advice would be much appreciated.

Thanks

i

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
 [7] LC_PAPER=en_GB.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 
 


--- On Sat, 18/12/10, Charles C. Berry cbe...@tajo.ucsd.edu wrote:

 From: Charles C. Berry cbe...@tajo.ucsd.edu
 Subject: Re: [R] help with function
 To: Iain Gallagher iaingallag...@btopenworld.com
 Cc: r-help@r-project.org
 Date: Saturday, 18 December, 2010, 0:13
 On Fri, 17 Dec 2010, Iain Gallagher
 wrote:
 
  Hello List
 
  I'm moving this over from the bioC list as, although
 the problem I'm working on is biological, the current bottle
 neck is my poor understanding of R.
 
  I wonder if someone would help me with the following
 function.
 
 
 Here is how I'd take it apart.
 
 Either
 
   1) put browser() as the first line of the function, then feed lines to the
      browser one by one to see where it hangs,
 
   2) trace(cumulMetric), then try to run it (much like 1, but it will
      handle feeding the lines of the function more easily)
 
   3) options( error = recover ), then run it till it hits the error, then
      browse through the frames to see what is where
 
 See
 
      ?browser
      ?trace
      ?recover
 
 as background.
 
 HTH,
 
 Chuck
 

Re: [R] [BioC] problem with function

2010-12-18 Thread Iain Gallagher
Hi Christian, Chuck (and lists)

It seems that the problem may be the strange behaviour of 'unstack' inside a 
function. 

See this thread in the R mailing list:

http://tolstoy.newcastle.edu.au/R/help/04/03/1160.html

Anyway, I got round the problem by using 'aggregate' instead of converting to a 
list and then tapply to sum values of metric. Probably more efficient as well.

Thanks for the help offered.

My function now looks like this (for the record!) and behaves as it should.

makeMetric <- function(deMirPresGenes, deMirs){

    # match each miR in deMirPresGenes to its FC, giving a vector of FC in the
    # correct order
    fcVector <- as.numeric(with(deMirs, FC[match(deMirPresGenes[,4], Probe)]))

    # multiply fc by context score for each interaction
    metric <- fcVector * as.numeric(deMirPresGenes[,11])

    geneMetric <- cbind(deMirPresGenes[,2], metric)
    colnames(geneMetric) <- c('sym', 'metric')

    # make cumulative score per gene with aggregate
    listMetric <- aggregate(as.numeric(geneMetric[,2]), list(geneMetric[,1]), sum) # returns a dataframe
    colnames(listMetric) <- c('symbol','cumulMetric')

    # return the whole dataframe
    return(listMetric)
}
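
For anyone wanting to try this without the real data, here is a cut-down sketch on toy inputs. It refers to columns by name instead of position and uses the formula interface of aggregate(). The function name makeMetricSketch and the data frame names are mine, and the fourth row of the toy gene table is invented so that one gene has two interactions to sum.

```r
# Fold changes per miRNA (values from the deMirs$up example)
deMirs <- data.frame(Probe = c("hsa-miR-183", "hsa-miR-1275", "hsa-miR-495"),
                     FC    = c(2.63, 2.74, 3.13),
                     stringsAsFactors = FALSE)

# Toy target table; the last row is hypothetical
genes <- data.frame(Gene.Symbol   = c("AAK1", "ABCA1", "ABCA2", "AAK1"),
                    miRNA         = c("hsa-miR-183", "hsa-miR-183",
                                      "hsa-miR-495", "hsa-miR-495"),
                    context_score = c(-0.135, -0.133, -0.33, -0.10),
                    stringsAsFactors = FALSE)

makeMetricSketch <- function(genes, deMirs) {
  # look up the fold change for the miRNA in each row
  fc <- deMirs$FC[match(genes$miRNA, deMirs$Probe)]
  scored <- data.frame(symbol = genes$Gene.Symbol,
                       metric = fc * genes$context_score,
                       stringsAsFactors = FALSE)
  # sum the weighted scores within each gene symbol
  out <- aggregate(metric ~ symbol, data = scored, FUN = sum)
  names(out)[2] <- "cumulMetric"
  out
}

res <- makeMetricSketch(genes, deMirs)
print(res)
```

Keeping the scores in a data frame rather than a cbind() matrix also avoids the numeric values being converted to character and back.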

Cheers

i

--- On Sat, 18/12/10, cstrato cstr...@aon.at wrote:

 From: cstrato cstr...@aon.at
 Subject: Re: [BioC] problem with function
 To: Iain Gallagher iaingallag...@btopenworld.com
 Cc: bioconductor bioconduc...@stat.math.ethz.ch
 Date: Saturday, 18 December, 2010, 14:40
 You need to do:
 
 cumulMetric <- function(deMirPresGenes, deMirs){
     fc <- deMirs
     fcVector <- as.numeric(with(fc, FC[match(deMirPresGenes[,4], Probe)]))
 
     metric <- fcVector * as.numeric(deMirPresGenes[,11])
     geneMetric <- as.data.frame(cbind(deMirPresGenes[,2], as.numeric(metric)))
     colnames(geneMetric) <- c('y', 'x')
 
     listMetric <- unstack(geneMetric, x ~ y)
     listMetric <- as.data.frame(sapply(listMetric,sum)) #returns a dataframe
     colnames(listMetric) <- c('cumulMetric')
 
     return(listMetric)
 }
 
 Regards
 Christian
 
 On 12/17/10 11:52 PM, Iain Gallagher wrote:
  ok... done. Not really any further forward here.
 
  print statements after creating fcVector, metric and
 geneMetric (see output below). They all look ok in terms of
 structure and length. But the error persists and listMetric
 is not made?!?! Odd.
 
  I have added some comments to the output below.
 
  > tf2 <- cumulMetric(tf1, deMirs$up) # deMirs$up is a dataframe (see prev posts)
  [1] 2.63 2.63 3.13 2.63 3.13 2.74 # print fcVector - looks ok
  [1] -0.35505 -0.34979 -1.03290 -1.22558 -0.61348 -0.86584 # print metric - looks ok
  [1] 1045 # length of metric - is correct
       sym    metric    # print geneMetric - looks ok
  [1,] AAK1   -0.35505
  [2,] ABCA1  -0.34979
  [3,] ABCA2  -1.0329
  [4,] ABCB10 -1.22558
  [5,] ABCE1  -0.61348
  [6,] ABCF3  -0.86584
  [1] 1045 # nrow of geneMetric - is correct
  Error in eval(expr, envir, enclos) : object 'geneMetric' not found
 
 
  cheers
 
  i
  --- On Fri, 17/12/10, cstratocstr...@aon.at 
 wrote:
 
  From: cstratocstr...@aon.at
  Subject: Re: [BioC] problem with function
  To: Iain Gallagheriaingallag...@btopenworld.com
  Cc: bioconductorbioconduc...@stat.math.ethz.ch
  Date: Friday, 17 December, 2010, 22:38
  At the moment I have no idea, but
  what I would do in this case is to put
  print() statements after each line to see where it
 fails.
 
  Christian
 
  On 12/17/10 10:59 PM, Iain Gallagher wrote:
  Hi
 
  FC is the second column of the deMirs
 variable. deMirs
  is a dataframe with 2 columns - Probe (e.g.
 hsa-miR-145) and
  FC (e.g 1.45). Using 'with' allows me to use
 deMirs as an
  'environment'. I thus don't have to pass FC
 explicitly.
 
  Cheers
 
  i
 
  --- On Fri, 17/12/10, cstratocstr...@aon.at
  wrote:
 
  From: cstratocstr...@aon.at
  Subject: Re: [BioC] problem with function
  To: Iain Gallagheriaingallag...@btopenworld.com
  Cc: bioconductorbioconduc...@stat.math.ethz.ch
  Date: Friday, 17 December, 2010, 20:39
  What is FC[]?  It is not passed
  to the function. Christan
 
  On 12/17/10 8:11 PM, Iain Gallagher
 wrote:
  Sorry.
 
  That was a typo. In my script
  deMirPresGenes1[,4] is
  deMirPresGenes[,4].
 
  Just to be sure I'm going about this
 the right
  way
  though I should say that at the moment I
 assign
  the output
  of another function to a variable called
 'tf1' -
  this object
  is the same as the deMirPresGenes is my
 previous
  email.
 
  This is then fed to my problem
 function using
  positional matching.
 
  e.g. tf2- cumulMetric(tf1,
 deMirs)
 
  Which leads to:
 
  Error in eval(expr, envir, enclos) :
 object
  'geneMetric' not found
 
  Hey ho!
 
  i
 
  --- On Fri, 17/12/10, cstratocstr...@aon.at
  wrote:
 
  From: cstratocstr...@aon.at
  Subject: Re: [BioC] problem with
 function
  To: Iain Gallagheriaingallag...@btopenworld.com
  Cc: bioconductorbioconduc...@stat.math.ethz.ch
  Date: Friday, 17 December, 2010,
 18:40
  I am not sure but I would say

[R] help with function

2010-12-17 Thread Iain Gallagher
Hello List

I'm moving this over from the bioC list as, although the problem I'm working on 
is biological, the current bottleneck is my poor understanding of R. 

I wonder if someone would help me with the following function.

cumulMetric <- function(deMirPresGenes, deMirs){

    # match each miR in deMirPresGenes to its FC, giving a vector of FC in the
    # correct order
    fc <- deMirs
    fcVector <- as.numeric(with(fc, FC[match(deMirPresGenes[,4], Probe)]))

    # multiply fc by context score for each interaction
    metric <- fcVector * as.numeric(deMirPresGenes[,11])
    geneMetric <- cbind(deMirPresGenes[,2], as.numeric(metric))

    # make cumul weighted score
    listMetric <- unstack(geneMetric, as.numeric(geneMetric[,2])~geneMetric[,1])
    listMetric <- as.data.frame(sapply(listMetric,sum)) # returns a dataframe
    colnames(listMetric) <- c('cumulMetric')

    # return whole list
    return(listMetric)
}

deMirPresGenes looks like this:

Gene.ID  Gene.Symbol  Species.ID  miRNA         Site.type  UTR_start  UTR_end  X3pairing_contr  local_AU_contr  position_contr  context_score  context_percentile
22848    AAK1         9606        hsa-miR-183   2          1546       1552     -0.026           -0.047          0.099           -0.135         47
19       ABCA1        9606        hsa-miR-183   2          1366       1372     -0.011           -0.048          0.087           -0.133         46
20       ABCA2        9606        hsa-miR-495   2          666        672      -0.042           -0.092          -0.035          -0.33          93
23456    ABCB10       9606        hsa-miR-183   3          1475       1481     0.003            -0.109          -0.05           -0.466         98
6059     ABCE1        9606        hsa-miR-495   2          1474       1480     0.005            -0.046          0.006           -0.196         58
55324    ABCF3        9606        hsa-miR-1275  3          90         96       0.007            0.042           -0.055          -0.316         94

although it is much longer in 'real life'.

The aim of the function is to extract a dataframe of gene symbols along with a 
weighted score from the above data. The weighted score is the FC column of 
deMirs * the context_score column of deMirPresGenes. This is easy peasy!
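
The lookup-and-multiply step on its own can be sketched with plain vectors. The values are taken from the first three rows of the tables in this post; the vector names are illustrative only.

```r
# Fold change per miRNA probe
probes <- c("hsa-miR-183", "hsa-miR-1275", "hsa-miR-495")
fc     <- c(2.63, 2.74, 3.13)

# miRNA and context score for the first three target sites
site_mirna    <- c("hsa-miR-183", "hsa-miR-183", "hsa-miR-495")
context_score <- c(-0.135, -0.133, -0.33)

# match() returns, for each site, the position of its miRNA in probes,
# so fc[...] lines the fold changes up with the sites
weighted <- fc[match(site_mirna, probes)] * context_score
print(weighted)
```

These agree with the first three metric values printed further down in this post.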

Where I'm falling down is that if I run this function it complains that 
'geneMetric' can't be found. Hmm - I've run it all line by line (i.e. not as a 
function) and it works but wrapped up like this it fails!

e.g.

> testF2 <- cumulMetric(testF1, deMirs$up)
Error in eval(expr, envir, enclos) : object 'geneMetric' not found

deMirs$up looks like this:

Probe    FC
hsa-miR-183    2.63
hsa-miR-1275    2.74
hsa-miR-495    3.13
hsa-miR-886-3p    3.73
hsa-miR-886-5p    3.97
hsa-miR-144*    6.62
hsa-miR-451    7.94

In an effort to debug this I have examined each object using 'print' statements 
(as suggested by cstrato on the bioC list).

All the objects in the function up until listMetric look ok in terms of 
structure and length. But the error persists and listMetric is not made?!?! Odd.

I have added some comments to the output below.

> tf2 <- cumulMetric(tf1, deMirs$up) # deMirs$up is a dataframe (see above - it is
                                     # the same as deMirs)

[1] 2.63 2.63 3.13 2.63 3.13 2.74 # print fcVector - looks ok
[1] -0.35505 -0.34979 -1.03290 -1.22558 -0.61348 -0.86584 # print metric - looks ok
[1] 1045 # length of metric - is correct
     sym    metric    # print geneMetric - looks ok
[1,] AAK1   -0.35505
[2,] ABCA1  -0.34979
[3,] ABCA2  -1.0329
[4,] ABCB10 -1.22558
[5,] ABCE1  -0.61348
[6,] ABCF3  -0.86584
[1] 1045 # nrow of geneMetric - is correct
Error in eval(expr, envir, enclos) : object 'geneMetric' not found


Could someone possibly point out where I am falling down?

Thanks

iain

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
 [7] LC_PAPER=en_GB.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_2.12.0






Re: [R] where do I send typos?

2010-03-26 Thread Iain Gallagher
Best to send any comments on documentation to the author of the documentation.

iain

--- On Fri, 26/3/10, Xu Wang xuwang...@gmail.com wrote:

 From: Xu Wang xuwang...@gmail.com
 Subject: [R] where do I send typos?
 To: r-help@r-project.org
 Date: Friday, 26 March, 2010, 5:15
 
 Hi,
 
 I notice several typos when reading documentation and I
 verify that they are
 still typos in the current build. Where do I send
 corrections? Note that
 most of them are minor typos.
 
 Thanks
 -- 
 View this message in context: 
 http://n4.nabble.com/where-do-I-send-typos-tp1691713p1691713.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] netlabR package in English

2010-03-26 Thread Iain Gallagher
Keith

Download the netLabR package from 
http://www.meduniwien.ac.at/user/georg.dorffner/netalg.html

i.e. download the .tar.gz or zip files. Inside these, in the inst/doc directory 
you'll find two English manuals.

HTH

iain

--- On Thu, 25/3/10, Keith McMillan keith.mcmil...@merrickbank.com wrote:

 From: Keith McMillan keith.mcmil...@merrickbank.com
 Subject: [R] netlabR package in English
 To: r-help@r-project.org
 Date: Thursday, 25 March, 2010, 21:24
 Dear R users,
 
  
 
 Is documentation for the netlabR package available in
 English?
 
  
 
 If not does anyone know if or when it will be?
 
  
 
 Regards,
 
  
 
 Keith
 


Re: [R] logical operations with lists

2010-02-12 Thread Iain Gallagher
> a <- list('a', 'b', 'c')
> b <- list('c', 'd', 'e')
> c <- intersect(a, b)
> c
[[1]]
[1] "c"

Is this what you want?

Cheers

Iain

--- On Fri, 12/2/10, Zoppoli, Gabriele (NIH/NCI) [G] zoppo...@mail.nih.gov 
wrote:

 From: Zoppoli, Gabriele (NIH/NCI) [G] zoppo...@mail.nih.gov
 Subject: [R] logical operations with lists
 To: r-help@r-project.org r-help@r-project.org
 Date: Friday, 12 February, 2010, 22:06
 Sorry, maybe it's easy but I haven't
 found anything useful:
 
 how can I obtain a list C that contains all the members in
 the list B that are not in list A? These are lists of names,
 not numbers!
 
 Thank you
 
 
 Gabriele Zoppoli, MD
 Ph.D. Fellow, Experimental and Clinical Oncology and
 Hematology, University of Genova, Genova, Italy
 Guest Researcher, LMP, NCI, NIH, Bethesda MD
 
 Work: 301-451-8575
 Mobile: 301-204-5642
 Email: zoppo...@mail.nih.gov


Re: [R] logical operations with lists

2010-02-12 Thread Iain Gallagher
Sorry, I misread your post.

Try this:

> c <- setdiff(b, a)
> c
[[1]]
[1] "d"

[[2]]
[1] "e"

This gives a list C that contains all the members of list B that are not in 
list A.



Re: [R] logical operations with lists

2010-02-12 Thread Iain Gallagher
these are vectors in R not lists

try:

> a <- c('a', 'b', 'c')   # first vector (like A)
> a
[1] "a" "b" "c"
> b <- c('c', 'd', 'e')   # second vector (like B)
> b
[1] "c" "d" "e"
> c <- setdiff(b, a)      # all those in B but not A
> c
[1] "d" "e"
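
If the names were read in with read.delim(), as in the quoted message below, then A and B are one-column data frames rather than vectors, which may be why %in% and setdiff() appear not to work. A minimal sketch, assuming the default column name V1 and using a handful of made-up rows in place of the real files:

```r
# Stand-ins for the imported files: one-column data frames
A <- data.frame(V1 = c("UQCRC1", "IDH3B", "A1BG"), stringsAsFactors = FALSE)
B <- data.frame(V1 = c("A1BG", "A1CF", "A2M", "IDH3B"), stringsAsFactors = FALSE)

# Pull out the columns as character vectors before comparing
C <- setdiff(B$V1, A$V1)   # names in B that are not in A
print(C)
```

B[[1]] and A[[1]] would do the same job if the columns have other names.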

cheers

iain


--- On Fri, 12/2/10, Zoppoli, Gabriele (NIH/NCI) [G] zoppo...@mail.nih.gov 
wrote:

 From: Zoppoli, Gabriele (NIH/NCI) [G] zoppo...@mail.nih.gov
 Subject: Re: [R] logical operations with lists
 To: r-help@r-project.org r-help@r-project.org
 Date: Friday, 12 February, 2010, 22:57
 I'm sorry but here's what I get:
 
  > A[1:10,]
   [1] UQCRC1 IDH3B  PDHA1  SUCLA2 COX5B  SDHB   SDHA   MDH2   DLD    COQ7
 
  > dim(A)
  [1] 1013    1
 
  > B[1:10,]
   [1] 3.8-1.2 3.8-1.3 3.8-1.4 3.8-1.5 5-HT3c2 A1BG    A1CF    A2BP1   A2LD1   A2M
 
  > dim(B)
  [1] 55546     1
 
  > C <- rbind(A,B)
  > dim(C)
  [1] 56559     1
 
  > D <- C[which(C %in% A == FALSE)]
  > dim(D)
  [1] 56559     0
 
 and so with any other proposed method.
 
 I imported the list A and B this way:
 
 
  > A <- as.vector(read.delim("E:/A.txt", sep="\t", header=FALSE))
 
 and then removed the redundant rows with:
 
  > A <- unique(A)
 
 Guess I'm doing something really wrong here... Sorry for
 the inexperience, I'm trying to improve...
 
 
 
 
 
 Gabriele Zoppoli, MD
 Ph.D. Fellow, Experimental and Clinical Oncology and
 Hematology, University of Genova, Genova, Italy
 Guest Researcher, LMP, NCI, NIH, Bethesda MD
 
 Work: 301-451-8575
 Mobile: 301-204-5642
 Email: zoppo...@mail.nih.gov
 
 From: jbreic...@gmail.com
 [jbreic...@gmail.com]
 On Behalf Of Jonathan [jonsle...@gmail.com]
 Sent: Friday, February 12, 2010 5:21 PM
 To: Zoppoli, Gabriele (NIH/NCI) [G]
 Cc: r-help@r-project.org
 Subject: Re: [R] logical operations with lists
 
 This is probably not the best way, but (assuming you had
 vectors and
 not lists, since I'm not sure what your list looks like):
 
 C <- B[which(B %in% A == FALSE)]
 
 Regards,
 Jonathan
 
 
 On Fri, Feb 12, 2010 at 5:06 PM, Zoppoli, Gabriele
 (NIH/NCI) [G]
 zoppo...@mail.nih.gov
 wrote:
  Sorry, maybe it's easy but I haven't found anything
 useful:
 
  how can I obtain a list C that contains all the
 members in the list B that are not in list A? These are lists
 of names, not numbers!
 
  Thank you
 
 
  Gabriele Zoppoli, MD
  Ph.D. Fellow, Experimental and Clinical Oncology and
 Hematology, University of Genova, Genova, Italy
  Guest Researcher, LMP, NCI, NIH, Bethesda MD
 
  Work: 301-451-8575
  Mobile: 301-204-5642
  Email: zoppo...@mail.nih.gov


Re: [R] Upgrading To 2.10 from 2.6.2

2009-12-08 Thread Iain Gallagher
Hi Steve

Have you tried:

apt-cache search gfortran

in a terminal window?

Then

sudo apt-get install theRelevantPackage

I think you also need the Universe repos enabled.

HTH

Iain

--- On Tue, 8/12/09, steve_fried...@nps.gov steve_fried...@nps.gov wrote:

 From: steve_fried...@nps.gov steve_fried...@nps.gov
 Subject: [R] Upgrading To 2.10 from 2.6.2
 To: r-help@r-project.org
 Date: Tuesday, 8 December, 2009, 13:38
 
 Hello
 
 I have a Linux machine (Ubuntu 8.04 hardy, Gcc version
 4.2.4 (i486-linux-gnu)) currently running R 2.6.2. I'd like to
 upgrade to 2.10.
 
 First question: What is the appropriate way to
 remove the old version of
 R?
 
 
 Part 2.
  After downloading  r-base_2.10.0.orig.tar.gz and
 opening the archive. I
 ran the ./configure routine.
 
 It failed claiming that it could not find the F77
 compiler.
 
 My sys admin and I searched for a compiler, but have not
 succeeded in
 finding one compatible for Ubuntu 8.04.
 
 2nd Question:
 Would you be so kind and point me towards one?
 
 Thanks in advance
 Steve
 
 
 Steve Friedman Ph. D.
 Spatial Statistical Analyst
 Everglades and Dry Tortugas National Park
 950 N Krome Ave (3rd Floor)
 Homestead, Florida 33034
 
 steve_fried...@nps.gov
 Office (305) 224 - 4282
 Fax     (305) 224 - 4147
 


[R] string splitting and testing for enrichment

2009-06-20 Thread Iain Gallagher
Hi List

I have data in the following form:

Gene      TFBS
NUDC      PPARA(1) HNF4(20) HNF4(96) AHRARNT(104) CACBINDINGPROTEIN(149) T3R(167) HLF(191)
RPA2      STAT4(57) HEB(251)
TAF12     PAX3(53) YY1(92) BRCA(99) GLI(101)
EIF3I     NERF(10) P300(10)
TRAPPC3   HIC1(3) PAX5(17) PAX5(110) NRF1(119) HIC1(122)
TRAPPC3   EGR(26) ZNF219(27) SP3(32) EGR(32) NFKAPPAB65(89) NFKAPPAB(89) RFX(121) ZTA(168)
NDUFS5    WHN(14) ATF(57) EGR3(59) PAX5(99) SF1(108) NRSE(146)
TIE1      NRSE(129)

I would like to test the 2nd column (each value has letters followed by numbers 
in brackets) here for enrichment via fisher.test.

To that end I am trying to create two factors made up of column 1 (Gene) and 
column 2 (TFBS) where each Gene would have several entries matching each TFBS.

My main problem just now is that I can't split the TFBS column into separate 
strings (at the moment that 2nd column is all one string for each Gene).

Here's where I am just now:

test <- as.character(dataIn[,2]) # convert the 2nd column from factor to character
# split the first element into individual strings (only the first element
# for now because I'm just trying to get things working)
test2 <- unlist(strsplit(test[1], ' '))
test3 <- unlist(strsplit(test2, '\\([0-9]\\)')) # get rid of numbers and brackets

now this does not behave as I hoped - it gives me:

> test3
[1] "PPARA"                  "HNF4(20)"               "HNF4(96)"
[4] "AHRARNT(104)"           "CACBINDINGPROTEIN(149)" "T3R(167)"
[7] "HLF(191)"

i.e. it only removes the numbers and brackets from the first entry and not the 
others.

Could someone point out my mistake please?
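
One likely cause, offered as a sketch rather than a definitive answer: the pattern '\\([0-9]\\)' matches exactly one digit between brackets, so only PPARA(1) is affected. Adding a + quantifier lets it match one or more digits. The example below also swaps strsplit() for gsub(), since the aim is to delete the bracketed numbers rather than split on them; the input string is copied from the NUDC row above.

```r
# TFBS string for one gene, taken from the NUDC row of the example data
tfbs <- "PPARA(1) HNF4(20) HNF4(96) AHRARNT(104) CACBINDINGPROTEIN(149) T3R(167) HLF(191)"

# Split on single spaces to get one site per element
sites <- unlist(strsplit(tfbs, " ", fixed = TRUE))

# [0-9]+ matches one or more digits; gsub() removes the bracketed position
names_only <- gsub("\\([0-9]+\\)", "", sites)
names_only
```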

Once I have all the TFBS (letters only) for each Gene I would then count how 
often a TFBS occurs and use this data for a fisher.test testing for enrichment 
of TFBS in the list I have. I'm rather muddled here though and would 
appreciate advice on whether this is the right approach.
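
A rough sketch of the counting step, assuming the data have been read into a two-column data frame (called raw here, with a few rows copied from the example). The background counts passed to fisher.test() are invented purely to show the shape of the 2x2 table; real values would have to come from a reference set of genes.

```r
# A few rows from the example data; 'raw' is a hypothetical name
raw <- data.frame(
  Gene = c("RPA2", "EIF3I", "NDUFS5", "TIE1"),
  TFBS = c("STAT4(57) HEB(251)", "NERF(10) P300(10)",
           "WHN(14) ATF(57) EGR3(59) PAX5(99) SF1(108) NRSE(146)", "NRSE(129)"),
  stringsAsFactors = FALSE
)

# One row per Gene/TFBS pair, with the bracketed positions removed
sites <- strsplit(raw$TFBS, " ", fixed = TRUE)
long <- data.frame(
  Gene = rep(raw$Gene, lengths(sites)),
  TFBS = gsub("\\([0-9]+\\)", "", unlist(sites)),
  stringsAsFactors = FALSE
)

# How often each TFBS occurs in the list
counts <- table(long$TFBS)

# 2x2 table for one TFBS; the background numbers are made up for illustration
hits_list <- as.integer(counts["NRSE"])
n_list <- nrow(long)
hits_bg <- 50    # hypothetical
n_bg <- 5000     # hypothetical
tab <- matrix(c(hits_list, n_list - hits_list, hits_bg, n_bg - hits_bg),
              nrow = 2,
              dimnames = list(c("NRSE", "other"), c("list", "background")))
ft <- fisher.test(tab, alternative = "greater")
ft$p.value
```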

Thanks

Iain

> sessionInfo()
R version 2.9.0 (2009-04-17) 
x86_64-pc-linux-gnu 

locale:
LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 








[R] arithmetic problem

2009-05-30 Thread Iain Gallagher

Hello list

I have a problem with a dataset (see toy example below) where I am trying to 
find the differences between two (or more) numbers and discard those 
observations which fall outside a set interval.

An example and further explanation:

   values      ind
1    2655      7A5
2    3028      7A5
3     689   ABBA-1
4    1336   ABBA-1
5    1560   ABBA-1
6    2820   ABLIM1
7    3339   ABLIM1
8     171    ACSM5
9     195    ACSM5
10     43 ADAMDEC1
11    129 ADAMDEC1
12   1105     AFF1
13   3202     AFF1
14    852     AFF3
15   2461     AFF3
16     45     AKT1
17    397     AKT1
18   1430     AQP2
19   2402     AQP2
20   2551 ARHGAP19

Each number in the values column above is associated with a label (in the ind 
column). For some inds there will be only 2 values but as can be seen from the 
data other inds have many values.

Here's what I want to do using the ABBA-1 data from above as an example:

calculate the differences between each value:

1560-1336 = 224
1336-689 = 647

then use these values to create an index that will allow me to pull out values 
between set limits. If I set the limits to between 200 and 300 then the index 
will reference rows 4 & 5 in the above data set.
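
One possible way to build such an index, sketched on the first seven rows of the example and assuming the rows are grouped by ind with values in ascending order within each group. The names dat, gap, lower and upper are made up for the illustration.

```r
# First seven rows of the toy data
dat <- data.frame(
  values = c(2655, 3028, 689, 1336, 1560, 2820, 3339),
  ind = c("7A5", "7A5", "ABBA-1", "ABBA-1", "ABBA-1", "ABLIM1", "ABLIM1"),
  stringsAsFactors = FALSE
)

# Difference from the previous value within the same ind
# (NA for the first row of each ind, so gaps never span two labels)
dat$gap <- ave(dat$values, dat$ind, FUN = function(v) c(NA, diff(v)))

lower <- 200
upper <- 300

# Rows that close a gap inside the limits, plus the row that opens it
hit <- which(dat$gap >= lower & dat$gap <= upper)
idx <- sort(unique(c(hit - 1, hit)))
idx           # rows 4 and 5 for these limits
dat[idx, ]
```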

I hope this is reasonably clear and I appreciate any suggestions.

Thanks

Iain




[R] counting occurrence of text in a dataframe

2009-05-23 Thread Iain Gallagher

Hello list.

I am hoping for some help with a relatively simple problem. I have a data frame 
arranged as below, with one column per experiment. I want to be able to count 
the occurrences of each gene (e.g. let-7e) across experiments. In other words, 
how many times does a given gene crop up in the data frame? I tried table but 
couldn't work out how to get the output I want. I have also considered 
rearranging this data into a list (by gene) and then counting the length of 
each gene element. However, I thought that there might be a more elegant 
solution.

Tanaka    Mitchell     Wang         Hunter    Chen      Chim
miR-191*  let-7e       let-7b       miR-126   let-7a    let-7g
miR-198   let-7f       let-7c       miR-146a  let-7b    let-7i
miR-22    let-7g       miR-1224     miR-16    let-7d    miR-130b
miR-223   let-7i       miR-124      miR-191   let-7f    miR-133a
miR-296   let7fG15A    miR-125a-3p  miR-222   let-7g    miR-140
miR-30d   miR-101      miR-125b-5p  miR-223   let-7i    miR-142-5p
miR-370   miR-103      miR-133a     miR-24    miR-101   miR-146
miR-486   miR-125a-5p  miR-133b     miR-26a   miR-103   miR-148a
miR-498   miR-126      miR-135a*    miR-32    miR-106a  miR-152
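
For a data frame laid out like the one above, flattening all columns into a single vector and tabulating it may be enough. A minimal sketch using the first three rows of four of the columns (the name mirs is hypothetical):

```r
# First three rows of four of the experiment columns
mirs <- data.frame(
  Tanaka   = c("miR-191*", "miR-198", "miR-22"),
  Mitchell = c("let-7e", "let-7f", "let-7g"),
  Chen     = c("let-7a", "let-7b", "let-7d"),
  Chim     = c("let-7g", "let-7i", "miR-130b"),
  stringsAsFactors = FALSE
)

# Flatten every column into one character vector and count each gene
genes <- unlist(mirs, use.names = FALSE)
counts <- table(genes)
sort(counts, decreasing = TRUE)

# Gene-by-experiment table, showing which experiments report each gene
by_expt <- table(gene = genes,
                 experiment = rep(names(mirs), each = nrow(mirs)))
by_expt
```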

Thanks for any advice.

Cheers

Iain



[R] replacing values in a vector

2008-11-06 Thread Iain Gallagher
Hello list.

I have a vector of values:

eg

> head(diff_mirs_list)
[1] hsa-miR-26b hsa-miR-26b hsa-miR-23a hsa-miR-27b hsa-miR-29a
[6] hsa-miR-29b

and I would like to conditionally replace each value in this vector with a 
number defined in a dataframe:

> fc
             Probe ave.fc
1       hsa-let-7a   1.28
2      hsa-miR-100   1.47
3  hsa-miR-125a-5p   1.31
4   hsa-miR-140-3p   1.28
5      hsa-miR-143   1.98
6  hsa-miR-193a-3p   1.37
7     hsa-miR-193b   1.48
8      hsa-miR-195   1.16
9      hsa-miR-214   1.22
10     hsa-miR-23a   1.21
11     hsa-miR-26b   1.13
12     hsa-miR-27b   1.37
13     hsa-miR-29a   1.24
14     hsa-miR-29b   1.69
15     hsa-miR-30b   1.16
16     hsa-miR-424   1.42
17  hsa-miR-768-3p   1.48
18  hsa-miR-886-3p   1.43
19     hsa-miR-933   1.23

i.e. every hsa-let-7a in the diff_mirs_list is replaced by 1.28, hsa-miR-100 by 
1.47, etc.

I have tried to make a loop to use gsub, e.g.

> for (i in 1:nrow(fc)){
+ test <- gsub(fc[i,1], fc[i,2], diff_mirs_list)
+ }

but this obviously passes the unchanged vector to gsub each time and so I get 
back my 'test' vector with only hsa-miR-933 changed. Could someone help me out 
with this please?
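
One way to avoid the loop altogether is a lookup with match(), which returns the row of fc that each element of the vector corresponds to. A minimal sketch on a few rows of the data above (the name replaced is made up for the illustration):

```r
# A few rows of the lookup table and the first six values of the vector
fc <- data.frame(
  Probe  = c("hsa-miR-23a", "hsa-miR-26b", "hsa-miR-27b", "hsa-miR-29a", "hsa-miR-29b"),
  ave.fc = c(1.21, 1.13, 1.37, 1.24, 1.69),
  stringsAsFactors = FALSE
)
diff_mirs_list <- c("hsa-miR-26b", "hsa-miR-26b", "hsa-miR-23a",
                    "hsa-miR-27b", "hsa-miR-29a", "hsa-miR-29b")

# match() gives the row of fc for each element; as.character() guards
# against either side being a factor. Unmatched names come back as NA.
replaced <- fc$ave.fc[match(as.character(diff_mirs_list), as.character(fc$Probe))]
replaced
```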

Thanks

Iain

> sessionInfo()
R version 2.8.0 (2008-10-20) 
i486-pc-linux-gnu 

locale:
LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base  



Re: [R] Free SQL Database with R

2008-09-03 Thread Iain Gallagher
Hi Chibisi

I'm sort of jumping into this thread but from your description of charts / 
plotting below I thought you might like to take a look at flot:

http://code.google.com/p/flot/ 

From the page:

"Flot is a pure Javascript plotting library for jQuery. It produces graphical 
plots of arbitrary datasets on-the-fly client-side. The focus is on simple 
usage (all settings are optional), attractive looks and interactive features 
like zooming. 

Although Flot is easy to use, it is also advanced enough to be suitable for Web 
2.0 data mining/business intelligence purposes which is its original 
application. 

The plugin is targeting all newer browsers. If you find a problem, please 
report it. Drawing is done with the canvas tag introduced by Safari and now 
available on all major browsers, except Internet Explorer where the excanvas 
Javascript emulation helper is used."

Maybe it could be useful to you for plots etc.

Cheers

Iain




Chibisi Chima-Okereke [EMAIL PROTECTED] wrote:

Dear Felix,

Thanks for the reply,

If you haven't already guessed I am new to web programming.

The sort of webpage I want to build is one that presents quantitative
information in graphs and charts, that people can interact with, e.g. select
parts of charts to zoom into, highlight values, click buttons to do analysis
on the data displayed, so yes some sort of interactive GUI. I initially
thought of using Flash as a front end but I don't know any ActionScript, so
learning that to a suitable standard would take a lot of extra time, and I
think it would be best if everything could be done in R as much as possible.

If I used an RGUI I guess I would be using the playwith package? Do the
consumers of the website need to have R to consume stuff displayed with an
RGUI?

The database itself would be pretty static, just being queried for
information, unless some analysis was required, in which case R would query
the database, do the analysis and write the results back to the database (I'm
guessing that is the way it would be done), before it gets displayed on the
web page.

Kind Regards

Chibisi

On Wed, Sep 3, 2008 at 11:39 AM, drflxms  wrote:

 Hello Chibisi,

 I am not sure whether I completely understand your needs: Do you want
 to build a webpage which relies on a content management system (cms)? Do
 you want to collect data (i.e. survey) which later on shall be analysed
 using R? Or shall it be a webpage with an interactive R GUI? What else?

 But personally I would prefer MySQL as backend for websites, as most
 professional (opensource) cms (i.e. typo3, wordpress etc.) are created
 with MySQL in mind.
 There is a good interface between R and MySQL called RMySQL as well. I
 use this on a daily basis, as all my data is stored in a local MySQL
 database (more flexible than always reading in text files - at least in
 my opinion).

 Hope this personal view might help a little bit.

 Cheers,
 Felix



[R] help with data layout

2008-07-17 Thread Iain Gallagher
Hello list

I have been given some Excel sheets with data laid like this:

Col1  Col2
A     3
      2
      3
B     4
      5
      4
C     1
      4
      3

I was hoping to import this into R as a csv and then get the mean and SD for 
each letter in column 1.

Could someone give me some guidance on how best to approach this?
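
A possible approach, sketched on an inline copy of the sheet above and assuming the blank cells in Col1 arrive as empty strings after export to csv and that the first row carries a label: fill each label down to the rows below it, then summarise by group.

```r
# Inline stand-in for the exported csv file
csv_text <- "Col1,Col2
A,3
,2
,3
B,4
,5
,4
C,1
,4
,3"
dat <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Carry the last non-blank label forward into the blank cells
filled <- !is.na(dat$Col1) & dat$Col1 != ""
dat$Col1 <- dat$Col1[filled][cumsum(filled)]

# Mean and SD of Col2 for each letter in Col1
means <- tapply(dat$Col2, dat$Col1, mean)
sds <- tapply(dat$Col2, dat$Col1, sd)
cbind(mean = means, sd = sds)
```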

Thanks

Iain



Re: [R] Help with data layout - Thanks

2008-07-17 Thread Iain Gallagher
Thanks for all the excellent replies.

Iain



Re: [R] Another failed attempt to install an R package in Ubuntu

2008-07-12 Thread IAIN GALLAGHER
Hi Miklos.

If you want to install R packages to the /usr/lib/R/library directory you need 
to start R with 'root' user privileges. Ordinary users do not have the 
appropriate permissions to write to this directory. At the command line type:

sudo R

and type your password when prompted. 

You won't need to do this for using R or the packages though; you can start R 
with your normal user permissions, i.e. in the usual way. There are many online 
tutorials for the Linux permission system... have a google ;-)

Your error messages suggest you are missing some header files (e.g. stdlib.h is 
a header file from the C standard library) that are needed to build code. At 
the command line try this:

sudo apt-get install build-essential

That will install many of the files you need. There are many apt-get primers on 
the web as well; it's worth having a look.

Hope this helps somewhat.

Cheers

Iain

Miklos Kiss [EMAIL PROTECTED] wrote: 
I am running Ubuntu 8.04 through Wubi on a HP Pavilion dv4000 (my home
computer) and I recently installed R 2.7.1.  I attempted to install the
randomForest package from within the R environment and from the Linux
terminal.  Both attempts failed.  I've tried several mirrors but with no
luck.  Below is an example of the series of error messages I got when R told
me that the installation wasn't going well.

It tells me that several header files are missing.  For some reason it is
also trying to install them in '/home/mik/R/i486-pc-linux-gnu-library/2.7'
but my R library packages are located in '/usr/lib/R/library'.  I tried
downloading various x11dev packages and such, as suggested in similar
threads in this forum, but that didn't seem to help.

I'm really new (installed Ubuntu less than 4 months ago) to Linux but I've
been using R for about a year.  If anyone can point me in the right
direction on how to correctly install this package, I'd very much appreciate
it.

Thank you,
Miklos Kiss

> install.packages()
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
Warning in install.packages() :
  argument 'lib' is missing: using
'/home/mik/R/i486-pc-linux-gnu-library/2.7'
trying URL 'http://cran.mtu.edu/src/contrib/randomForest_4.5-25.tar.gz'
Content type 'application/x-gzip' length 70212 bytes (68 Kb)
opened URL
==
downloaded 68 Kb

* Installing *source* package 'randomForest' ...
** libs
gcc -std=gnu99 -I/usr/share/R/include  -fpic  -g -O2 -c classTree.c -o
classTree.o
In file included from classTree.c:15:
/usr/share/R/include/R.h:28:20: error: stdlib.h: No such file or directory
/usr/share/R/include/R.h:29:19: error: stdio.h: No such file or directory
In file included from
/usr/lib/gcc/i486-linux-gnu/4.2.3/include/syslimits.h:7,
 from /usr/lib/gcc/i486-linux-gnu/4.2.3/include/limits.h:11,
 from /usr/share/R/include/R.h:30,
 from classTree.c:15:
/usr/lib/gcc/i486-linux-gnu/4.2.3/include/limits.h:122:61: error: limits.h:
No such file or directory
In file included from classTree.c:15:
/usr/share/R/include/R.h:32:18: error: math.h: No such file or directory
/usr/share/R/include/R.h:33:19: error: errno.h: No such file or directory
In file included from /usr/share/R/include/R.h:50,
 from classTree.c:15:
/usr/share/R/include/R_ext/RS.h:24:39: error: string.h: No such file or
directory
classTree.c: In function ‘catmax_’:
classTree.c:305: warning: implicit declaration of function ‘pow’
classTree.c:305: warning: incompatible implicit declaration of built-in
function ‘pow’
make: *** [classTree.o] Error 1
ERROR: compilation failed for package 'randomForest'
** Removing '/home/mik/R/i486-pc-linux-gnu-library/2.7/randomForest'

The downloaded packages are in
 /tmp/RtmpT83A76/downloaded_packages
Warning message:
In install.packages() :
  installation of package 'randomForest' had non-zero exit status

-- 
View this message in context: 
http://www.nabble.com/Another-failed-attempt-to-install-an-R-package-in-Ubuntu-tp18415721p18415721.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] txt file, 14000+ rows, only last 8000 appear

2008-06-09 Thread IAIN GALLAGHER
I'm pretty sure I've had similar problems to this in the past and the problem 
has been a badly formatted text file, i.e. you think it's all tab delimited but 
it's not.

I can't be more specific (because it's your file) but it would be worth copying 
the non-appearing rows into a separate file to see if you can track down the 
region of the file that's causing the problem.
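
One way to look for such a region without copying rows by hand is count.fields() from the utils package, which reports how many fields each line contains. A minimal sketch on a small, deliberately malformed file written to a temporary path (the file contents are invented for the illustration):

```r
# Write a small tab-delimited file with one malformed line (hypothetical data)
path <- tempfile(fileext = ".txt")
writeLines(c("id\tstrain\tvalue",
             "1\tH5N1\t0.8",
             "2\tH5N1 0.3",      # a space where a tab should be
             "3\tH5N1\t0.5"), path)

# Number of tab-separated fields on each line; quote = "" stops stray
# quote characters from swallowing the lines that follow them
n_fields <- count.fields(path, sep = "\t", quote = "")
table(n_fields)

# Lines whose field count differs from the header line
bad_lines <- which(n_fields != n_fields[1])
bad_lines
```

Any line numbers returned point at the part of the file worth inspecting in a text editor.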

Cheers

Iain

RobertsLRRI [EMAIL PROTECTED] wrote: 

You are asked to follow the posting guide and to provide commented, 
minimal, self-contained, reproducible code.

Code entered:
Influenza <- read.delim("C://Documents and Settings//rroberts//desktop//H5N1//0v8a1n3.dat.txt")

Influenza

rows
6505
...
14421


So the first 6504 rows aren't displayed

Thank you
-- 
View this message in context: 
http://www.nabble.com/txt-file%2C-14000%2B-rows%2C-only-last-8000-appear-tp17701519p17735926.html
Sent from the R help mailing list archive at Nabble.com.
